Remote Sensing

Research

26 pages, 7207 KB

Open Access Article

MobileSAM-Track: Lightweight One-Shot Tracking and Segmentation of Small Objects on Edge Devices

by Yehui Liu, Yuliang Zhao, Xinyue Zhang, Xiaoai Wang, Chao Lian, Jian Li, Peng Shan, Changzeng Fu, Xiaoyong Lyu, Lianjiang Li, Qiang Fu and Wen Jung Li

Remote Sens. 2023, 15(24), 5665; https://doi.org/10.3390/rs15245665 - 7 Dec 2023

Cited by 11 | Viewed by 4285

Abstract

Tracking and segmenting small targets in remote sensing videos on edge devices has significant engineering value. However, many semi-supervised video object segmentation (S-VOS) methods rely heavily on large amounts of video random-access memory (VRAM), which makes deployment on edge devices challenging. Our goal is to develop an edge-deployable S-VOS method that achieves high-precision tracking and segmentation from a bounding box selected around the target object. First, a tracker is introduced to locate the tracked object in each frame, which eliminates the need to store segmentation results as other S-VOS methods do and thus avoids an increase in VRAM usage. Second, we use two lightweight components, correlation filters (CFs) and the Mobile Segment Anything Model (MobileSAM), to ensure the inference speed of our model. Third, a mask diffusion module is proposed that improves the accuracy and robustness of segmentation without increasing VRAM usage. We evaluate our method on a self-built dataset containing airplanes and vehicles. The results show that, on a GTX 1080 Ti, our model achieves a J&F score of 66.4% with VRAM usage below 500 MB while maintaining a processing speed of 12 frames per second (FPS). The proposed model performs well in tracking and segmenting small targets on edge devices, providing a solution for fields such as aircraft monitoring and vehicle tracking that require S-VOS on edge devices.

(This article belongs to the Special Issue Advanced Application of Artificial Intelligence and Machine Vision in Remote Sensing II)

18 pages, 3670 KB

Open Access Article

ERF-RTMDet: An Improved Small Object Detection Method in Remote Sensing Images

by Shuo Liu, Huanxin Zou, Yazhe Huang, Xu Cao, Shitian He, Meilin Li and Yuqing Zhang

Remote Sens. 2023, 15(23), 5575; https://doi.org/10.3390/rs15235575 - 30 Nov 2023

Cited by 9 | Viewed by 4788

Abstract

Small objects pose a significant challenge for object detection in complex remote sensing (RS) datasets. Existing detection methods achieve much lower accuracy on small objects than on medium and large ones. These methods suffer from limited feature information, susceptibility to complex background interference, and insufficient contextual information. To address these issues, a small object detection method with an enhanced receptive field, ERF-RTMDet, is proposed to achieve more robust detection of small objects in RS images. Specifically, three modules are employed to enhance the receptive field of small objects’ features. First, the Dilated Spatial Pyramid Pooling Fast Module is proposed to gather more contextual information on small objects and suppress the interference of background information. Second, the Content-Aware Reassembly of Features Module is employed in place of the nearest-neighbor upsampling operator for more efficient feature fusion. Finally, the Hybrid Dilated Attention Module is proposed to expand the receptive field of object features after the feature fusion network. Extensive experiments are conducted on the MAR20 and NWPU VHR-10 datasets. The experimental results show that ERF-RTMDet attains higher detection precision on small objects while maintaining or slightly improving the detection precision on mid-scale and large-scale objects.

21 pages, 6111 KB

Open Access Article

A Novel Adaptive Edge Aggregation and Multiscale Feature Interaction Detector for Object Detection in Remote Sensing Images

by Wei Huang, Yuhao Zhao, Le Sun, Lu Gao and Yuwen Chen

Remote Sens. 2023, 15(21), 5200; https://doi.org/10.3390/rs15215200 - 1 Nov 2023

Cited by 2 | Viewed by 2280

Abstract

Object detection (OD) in remote sensing (RS) images is an important task in the field of computer vision. OD techniques have achieved impressive advances in recent years. However, complex background interference, large-scale variations, and dense instances pose significant challenges for OD. These challenges may lead to misalignment between features extracted by OD models and the features of real objects. To address these challenges, we explore a novel single-stage detection framework for the adaptive fusion of multiscale features and propose a novel adaptive edge aggregation and multiscale feature interaction detector (AEAMFI-Det) for OD in RS images. AEAMFI-Det consists of an adaptive edge aggregation (AEA) module, a feature enhancement module (FEM) embedded in a context-aware cross-attention feature pyramid network (2CA-FPN), and a pyramid squeeze attention (PSA) module. The AEA module employs an edge enhancement mechanism to guide the network to learn spatial multiscale nonlocal dependencies and solve the problem of feature misalignment between the network’s focus and the real object. The 2CA-FPN employs level-by-level feature fusion to enhance multiscale feature interactions and effectively mitigate the misalignment between the scales of the extracted features and the scales of real objects. The FEM is designed to capture the local and nonlocal contexts as auxiliary information to enhance the feature representation of information interaction between multiscale features in a cross-attention manner. We introduce the PSA module to establish long-term dependencies between multiscale spaces and channels for better interdependency refinement. Experimental results obtained using the NWPU VHR-10 and DIOR datasets demonstrate the superior performance of AEAMFI-Det in object classification and localization.

16 pages, 3981 KB

Open Access Article

Transformer in UAV Image-Based Weed Mapping

by Jiangsan Zhao, Therese With Berge and Jakob Geipel

Remote Sens. 2023, 15(21), 5165; https://doi.org/10.3390/rs15215165 - 29 Oct 2023

Cited by 7 | Viewed by 3172

Abstract

Weeds affect crop yield and quality due to competition for resources. In order to reduce the risk of yield losses due to weeds, herbicides or non-chemical measures are applied. Weeds, especially creeping perennial species, are generally distributed in patches within arable fields. Hence, instead of applying control measures uniformly, precision weeding or site-specific weed management (SSWM) is highly recommended. Unmanned aerial vehicle (UAV) imaging is known for wide area coverage and flexible operation frequency, making it a potential solution to generate weed maps at a reasonable cost. Efficient weed mapping algorithms need to be developed together with UAV imagery to facilitate SSWM. Different machine learning (ML) approaches have been developed for image-based weed mapping, either classical ML models or the more up-to-date deep learning (DL) models taking full advantage of parallel computation on a GPU (graphics processing unit). Attention-based transformer DL models, which have seen a recent boom, are expected to overtake classical convolutional neural network (CNN) DL models. This inspired us to develop a transformer DL model for segmenting weeds, cereal crops, and ‘other’ in low-resolution RGB UAV imagery (about 33 mm ground sampling distance, g.s.d.) captured after the cereal crop had turned yellow. Images were acquired during three years in 15 fields with three cereal species (Triticum aestivum, Hordeum vulgare, and Avena sativa) and various weed flora dominated by creeping perennials (mainly Cirsium arvense and Elymus repens). The performance of our transformer model, 1Dtransformer, was evaluated through comparison with a classical DL model, 1DCNN, and two classical ML methods, i.e., random forest (RF) and k-nearest neighbor (KNN). The transformer model showed the best performance, with an overall accuracy of 98.694% on pixels set aside for validation. It also showed the best, and relatively good, agreement with ground reference data on total weed coverage ($R^2 = 0.598$).
In this study, we showed the outstanding performance and robustness of a 1Dtransformer model for weed mapping based on UAV imagery for the first time. The model can be used to obtain weed maps in cereal fields known to be infested by perennial weeds. These maps can be used as a basis for generating prescription maps for SSWM, either pre-harvest, post-harvest, or in the next crop, by applying herbicides or non-chemical measures.

25 pages, 10723 KB

Open Access Article

CDAU-Net: A Novel CoordConv-Integrated Deep Dual Cross Attention Mechanism for Enhanced Road Extraction in Remote Sensing Imagery

by Anchao Yin, Chao Ren, Weiting Yue, Hongjuan Shao and Xiaoqin Xue

Remote Sens. 2023, 15(20), 4914; https://doi.org/10.3390/rs15204914 - 11 Oct 2023

Viewed by 2427

Abstract

In the realm of remote sensing image analysis, the task of road extraction poses significant complexities, especially in the context of intricate scenes and diminutive targets. In response to these challenges, we have developed a novel deep learning network, christened CDAU-Net, designed to discern and delineate these features with enhanced precision. This network takes its structural inspiration from the fundamental architecture of U-Net while introducing innovative enhancements: we have integrated CoordConv convolutions into both the initial layer of the U-Net encoder and the terminal layer of the decoder, thereby facilitating a more efficacious processing of spatial information inherent in remote sensing images. Moreover, we have devised a unique mechanism termed the Deep Dual Cross Attention (DDCA), purposed to capture long-range dependencies within images—a critical factor in remote sensing image analysis. Our network replaces the skip-connection component of the U-Net with this newly designed mechanism, dealing with feature maps of the first four scales in the encoder and generating four corresponding outputs. These outputs are subsequently linked with the decoder stage to further capture the remote dependencies present within the remote sensing imagery. We have subjected CDAU-Net to extensive empirical validation, including testing on the Massachusetts Road Dataset and DeepGlobe Road Dataset. Both datasets encompass a diverse range of complex road scenes, making them ideal for evaluating the performance of road extraction algorithms. The experimental results showcase that whether in terms of accuracy, recall rate, or Intersection over Union (IoU) metrics, the CDAU-Net outperforms existing state-of-the-art methods in the task of road extraction. These findings substantiate the effectiveness and superiority of our approach in handling complex scenes and small targets, as well as in capturing long-range dependencies in remote sensing imagery. 
In sum, the design of CDAU-Net not only enhances the accuracy of road extraction but also presents new perspectives and possibilities for deep learning analysis of remote sensing imagery.

18 pages, 2869 KB

Open Access Article

An Enhanced Target Detection Algorithm for Maritime Search and Rescue Based on Aerial Images

by Yijian Zhang, Yong Yin and Zeyuan Shao

Remote Sens. 2023, 15(19), 4818; https://doi.org/10.3390/rs15194818 - 3 Oct 2023

Cited by 36 | Viewed by 4607

Abstract

Unmanned aerial vehicles (UAVs), renowned for their rapid deployment, extensive data collection, and high spatial resolution, are crucial in locating distressed individuals during search and rescue (SAR) operations. Challenges in maritime search and rescue include missed detections caused by factors such as sunlight reflection. In this study, we proposed an enhanced ABT-YOLOv7 algorithm for underwater person detection. This algorithm integrates an asymptotic feature pyramid network (AFPN) to preserve the target feature information. The BiFormer module enhances the model’s perception of small-scale targets, whereas the task-specific context decoupling (TSCODE) mechanism effectively resolves conflicts between localization and classification. In quantitative experiments on a curated dataset, our model outperformed methods such as YOLOv3, YOLOv4, YOLOv5, YOLOv8, Faster R-CNN, Cascade R-CNN, and FCOS. Compared with YOLOv7, our approach improves the mean average precision (mAP) from 87.1% to 91.6%. Our approach thus reduces the sensitivity of the detection model to low-lighting conditions and sunlight reflection, demonstrating enhanced robustness. These innovations advance UAV technology in the maritime search and rescue domain.

22 pages, 3244 KB

Open Access Article

Leveraging High-Resolution Long-Wave Infrared Hyperspectral Laboratory Imaging Data for Mineral Identification Using Machine Learning Methods

by Alireza Hamedianfar, Kati Laakso, Maarit Middleton, Tuomo Törmänen, Juha Köykkä and Johanna Torppa

Remote Sens. 2023, 15(19), 4806; https://doi.org/10.3390/rs15194806 - 3 Oct 2023

Cited by 10 | Viewed by 4018

Abstract

Laboratory-based hyperspectral imaging (HSI) is an optical non-destructive technology used to extract mineralogical information from bedrock drill cores. In the present study, drill core scanning in the long-wave infrared (LWIR; 8000–12,000 nm) wavelength region was used to map the dominant minerals in HSI pixels. Machine learning classification algorithms, including random forest (RF) and support vector machine, have previously been applied to the mineral characterization of drill core hyperspectral data. The objectives of this study are to expand semi-automated mineral mapping by investigating the mapping accuracy, generalization potential, and classification ability of cutting-edge methods, such as various ensemble machine learning algorithms and deep learning semantic segmentation. In the present study, the mapping of quartz, talc, chlorite, and mixtures thereof in HSI data was performed using the ENVINet5 algorithm, which is based on the U-net deep learning network, and five decision tree ensemble algorithms: RF, gradient-boosting decision tree (GBDT), light gradient-boosting machine (LightGBM), AdaBoost, and bagging. Prior to training the classification models, endmembers were selected using the Sequential Maximum Angle Convex Cone endmember extraction method to prepare the samples used in model training and in the evaluation of the classification results. The results show that the GBDT and LightGBM classifiers outperformed the other classification models, with overall accuracies of 89.43% and 89.22%, respectively. The other classifiers achieved overall accuracies of 87.32%, 87.33%, 82.74%, and 78.32% for RF, bagging, ENVINet5, and AdaBoost, respectively. Therefore, the findings of this study confirm that ensemble machine learning algorithms are efficient tools for analyzing drill core HSI data and mapping dominant minerals.
Moreover, the implementation of deep learning methods for mineral mapping from HSI drill core data should be further explored and adjusted.

21 pages, 5500 KB

Open Access Article

Unified Transformer with Cross-Modal Mixture Experts for Remote-Sensing Visual Question Answering

by Gang Liu, Jinlong He, Pengfei Li, Shenjun Zhong, Hongyang Li and Genrong He

Remote Sens. 2023, 15(19), 4682; https://doi.org/10.3390/rs15194682 - 24 Sep 2023

Cited by 7 | Viewed by 3421

Abstract

Remote-sensing visual question answering (RSVQA) aims to provide accurate answers to questions about remote sensing images by leveraging both visual and textual information during the inference process. However, most existing methods overlook the importance of the interaction between visual and language features: they typically adopt simple feature fusion strategies and fail to adequately model cross-modal attention, and therefore struggle to capture the complex semantic relationships between questions and images. In this study, we introduce a unified transformer with cross-modal mixture experts (TCMME) model to address the RSVQA problem. Specifically, we utilize the vision transformer (ViT) and BERT to extract visual and language features, respectively. Furthermore, we incorporate cross-modal mixture experts (CMMEs) to facilitate cross-modal representation learning. By leveraging the shared self-attention and cross-modal attention within CMMEs, as well as the modality experts, we effectively capture the intricate interactions between visual and language features and better focus on their complex semantic relationships. Finally, we conduct qualitative and quantitative experiments on two benchmark datasets: RSVQA-LR and RSVQA-HR. The results demonstrate that our proposed method surpasses the current state-of-the-art (SOTA) techniques. Additionally, we perform an extensive analysis to validate the effectiveness of different components in our framework.

23 pages, 37749 KB

Open Access Article

MHLDet: A Multi-Scale and High-Precision Lightweight Object Detector Based on Large Receptive Field and Attention Mechanism for Remote Sensing Images

by Liming Zhou, Hang Zhao, Zhehao Liu, Kun Cai, Yang Liu and Xianyu Zuo

Remote Sens. 2023, 15(18), 4625; https://doi.org/10.3390/rs15184625 - 20 Sep 2023

Cited by 4 | Viewed by 2102

Abstract

Object detection in remote sensing images (RSIs) has become crucial in recent years. However, researchers often prioritize detecting small objects, neglecting medium- to large-sized ones. Moreover, detecting objects hidden in shadows is challenging. Additionally, most detectors have a large number of parameters, leading to higher hardware costs. To address these issues, this paper proposes a multi-scale and high-precision lightweight object detector named MHLDet. Firstly, we integrated the SimAM attention mechanism into the backbone and constructed a new feature-extraction module called validity-neat feature extract (VNFE). This module captures more feature information while simultaneously reducing the number of parameters. Secondly, we propose an improved spatial pyramid pooling model, named SPPE, to better integrate multi-scale feature information, enhancing the model’s ability to detect multi-scale objects. Finally, this paper introduces the convolution aggregation crosslayer (CACL) into the network. This module can reduce the size of the feature map and enhance the ability to fuse context information, thereby obtaining a feature map with more semantic information. We performed evaluation experiments on both the SIMD dataset and the UCAS-AOD dataset. Compared to other methods, our approach achieved the highest detection accuracy. Furthermore, it reduced the number of parameters by 12.7% compared to YOLOv7-Tiny. The experimental results show that our proposed method is more lightweight and exhibits superior detection accuracy compared to other lightweight models.

24 pages, 11641 KB

Open Access Article

ARE-Net: An Improved Interactive Model for Accurate Building Extraction in High-Resolution Remote Sensing Imagery

by Qian Weng, Qin Wang, Yifeng Lin and Jiawen Lin

Remote Sens. 2023, 15(18), 4457; https://doi.org/10.3390/rs15184457 - 10 Sep 2023

Cited by 3 | Viewed by 2165

Abstract

Accurate building extraction from high-resolution remote sensing images is critical for topographic mapping, urban planning, and many other applications. Its main task is to label each pixel as building or non-building. Although deep-learning-based algorithms have significantly enhanced the accuracy of building extraction, fully automated methods are limited by the requirement for a large number of annotated samples, resulting in limited generalization ability, easy misclassification in complex remote sensing images, and higher annotation costs. To address these challenges, this paper proposes an improved interactive building extraction model, ARE-Net, which adopts a deep interactive segmentation approach. In this paper, we present several key contributions. Firstly, an adaptive-radius encoding (ARE) module was designed to optimize the interaction features of clicks based on the varying shapes and distributions of buildings to provide maximum a priori information for building extraction. Secondly, a two-stage training strategy was proposed to enhance the convergence speed and efficiency of the segmentation process. Finally, comprehensive experiments using two models of different sizes (HRNet18s+OCR and HRNet32+OCR) were conducted on the Inria and WHU building datasets. The results showed significant improvements over the current state-of-the-art method in terms of $\mathrm{NoC}_{90}$. The proposed method achieved performance enhancements of 7.98% and 13.03% with HRNet18s+OCR and 7.34% and 15.49% with HRNet32+OCR on the WHU and Inria datasets, respectively. Furthermore, the experiments demonstrated that the proposed ARE-Net method significantly reduced the annotation costs while improving the convergence speed and generalization performance.

21 pages, 13149 KB

Open Access Article

A Split-Frequency Filter Network for Hyperspectral Image Classification

by Jinfu Gong, Fanming Li, Jian Wang, Zhengye Yang and Xuezhuan Ding

Remote Sens. 2023, 15(15), 3900; https://doi.org/10.3390/rs15153900 - 7 Aug 2023

Cited by 5 | Viewed by 2602

Abstract

The intricate structure of hyperspectral images, which comprise hundreds of successive spectral bands, makes it challenging for conventional approaches to classify this information quickly and precisely. The classification performance of hyperspectral images has substantially improved in the past decade with the emergence of deep-learning-based techniques. Owing to their excellent feature extraction and modeling capabilities, convolutional neural networks (CNNs) have become a robust backbone for hyperspectral image classification. However, CNNs fail to adequately capture the dependency and contextual information of the sequence of spectral properties due to the restrictions inherent in their fundamental network characteristics. To tackle this issue, we analyzed hyperspectral image classification from a frequency-domain perspective and proposed a split-frequency filter network. It is a simple and effective network architecture that improves the performance of hyperspectral image classification through three critical components: a split-frequency filter network, a detail-enhancement layer, and a nonlinear unit. First, the split-frequency filter network captures the interactions between neighboring spectral bands in the frequency domain. The classification performance is then enhanced using the detail-enhancement layer with a frequency-domain attention technique. Finally, a nonlinear unit is incorporated into the frequency-domain output layer to expedite training and boost performance. Experiments on various hyperspectral datasets demonstrate that the method outperforms other state-of-the-art approaches (an overall accuracy (OA) improvement of at least 2%), particularly when training samples are scarce.

21 pages, 9733 KB

Open Access Article

Impact of Horizontal Resolution on the Robustness of Radiation Emulators in a Numerical Weather Prediction Model

by Hwan-Jin Song and Soonyoung Roh

Remote Sens. 2023, 15(10), 2637; https://doi.org/10.3390/rs15102637 - 18 May 2023

Cited by 2 | Viewed by 2041

Abstract

Developing a machine-learning-based radiative transfer emulator in a weather forecasting model is valuable because it can significantly improve the computational speed of forecasting severe weather events. To replace the radiative transfer parameterization in the weather forecasting model, the universal applicability of the radiation emulator is essential, indicating a transition from the research to the operational level. This study investigates how the forecast accuracy of the radiation emulator for the Korean Peninsula degrades when it is tested at different horizontal resolutions (100–0.25 km), relative to the accuracy attained at the training resolution (5 km), with a view to universal application. In real-case simulations (100–5 km), the forecast errors of radiative fluxes and precipitation were reduced at coarse resolutions. Ideal-case simulations (5–0.25 km) showed larger errors in heating rates and fluxes at fine resolutions, implying the difficulty of predicting heating rates and fluxes at cloud-resolving scales. However, all simulations maintained an appropriate accuracy range compared with observations in real-case simulations or the infrequent use of radiative transfer parameterization in ideal-case simulations. These findings demonstrate the feasibility of a universal radiation emulator associated with different resolutions/models and emphasize the importance of emulating high-resolution modeling in the future.

19 pages, 1145 KB

Open Access Editor’s Choice Article

CRSNet: Cloud and Cloud Shadow Refinement Segmentation Networks for Remote Sensing Imagery

by Chao Zhang, Liguo Weng, Li Ding, Min Xia and Haifeng Lin

Remote Sens. 2023, 15(6), 1664; https://doi.org/10.3390/rs15061664 - 20 Mar 2023

Cited by 34 | Viewed by 3956

Abstract

Cloud detection is a critical task in remote sensing image processing. Due to the influence of ground objects and other noise, traditional detection methods are prone to missed or false detections and rough edge segmentation. To avoid the defects of traditional methods, Cloud and Cloud Shadow Refinement Segmentation Networks (CRSNet) are proposed in this paper. The network can correctly and efficiently detect smaller clouds and obtain finer edges. The model takes ResNet-18 as the backbone to extract features at different levels, and the Multi-scale Global Attention Module is used to strengthen the channel and spatial information to improve detection accuracy. The Strip Pyramid Channel Attention Module is used to learn spatial information at multiple scales to better detect small clouds. Finally, the high-dimensional and low-dimensional features are fused by the Hierarchical Feature Aggregation Module, and the final segmentation result is obtained by up-sampling layer by layer. The proposed model attains excellent results compared with classic segmentation methods and methods designed specifically for cloud segmentation on the Cloud and Cloud Shadow Dataset and the public dataset CSWV.

(This article belongs to the Special Issue Advanced Application of Artificial Intelligence and Machine Vision in Remote Sensing II)

25 pages, 8188 KB

Open Access | Article

Construction Site Multi-Category Target Detection System Based on UAV Low-Altitude Remote Sensing

by Han Liang, Jongyoung Cho and Suyoung Seo

Remote Sens. 2023, 15(6), 1560; https://doi.org/10.3390/rs15061560 - 13 Mar 2023

Cited by 9 | Viewed by 3576

Abstract

On-site management of construction sites has always been a significant problem faced by the construction industry. With the development of UAVs, their use to monitor construction safety and progress will make construction more intelligent. This paper proposes a multi-category target detection system based on UAV low-altitude remote sensing, aiming to solve the problems of relying on fixed-position cameras and a single category of established detection targets when mainstream target detection algorithms are applied to construction supervision. The experimental results show that the proposed method can accurately and efficiently detect 15 types of construction site targets. In terms of performance, the proposed method achieves the highest accuracy in each category compared to other networks, with a mean average precision (mAP) of 82.48%. Additionally, by applying it to an actual construction site, the proposed system is confirmed to have comprehensive detection capability and robustness.

(This article belongs to the Special Issue Advanced Application of Artificial Intelligence and Machine Vision in Remote Sensing II)

17 pages, 8037 KB

Open Access | Article

YOLO-HR: Improved YOLOv5 for Object Detection in High-Resolution Optical Remote Sensing Images

by Dahang Wan, Rongsheng Lu, Sailei Wang, Siyuan Shen, Ting Xu and Xianli Lang

Remote Sens. 2023, 15(3), 614; https://doi.org/10.3390/rs15030614 - 20 Jan 2023

Cited by 72 | Viewed by 13078

Abstract

Object detection is essential to the interpretation of optical remote sensing images and can serve as a foundation for research into additional visual tasks that utilize remote sensing. However, the object detection networks currently employed for optical remote sensing images underutilize the output of the feature pyramid, so there remains potential for improved detection. At present, a suitable balance between detection efficiency and detection performance is difficult to attain. This paper proposes an enhanced YOLOv5 algorithm for object detection in high-resolution optical remote sensing images, utilizing multiple layers of the feature pyramid, a multi-detection-head strategy, and a hybrid attention module to improve the performance of object-detection networks on optical remote sensing images. On the SIMD dataset, the mAP of the proposed method was 2.2% better than YOLOv5 and 8.48% better than YOLOX, achieving an improved balance between detection performance and speed.

(This article belongs to the Special Issue Advanced Application of Artificial Intelligence and Machine Vision in Remote Sensing II)

17 pages, 2769 KB

Open Access | Article

Mapping Dwellings in IDP/Refugee Settlements Using Deep Learning

by Omid Ghorbanzadeh, Alessandro Crivellari, Dirk Tiede, Pedram Ghamisi and Stefan Lang

Remote Sens. 2022, 14(24), 6382; https://doi.org/10.3390/rs14246382 - 16 Dec 2022

Cited by 2 | Viewed by 4326

Abstract

Improvements in computer vision, sensor quality, and remote sensing data availability make satellite imagery increasingly useful for studying human settlements. Several challenges remain to be overcome for some types of settlements, particularly for internally displaced populations (IDPs) and refugee camps. Refugee-dwelling footprints and detailed information derived from satellite imagery are critical for a variety of applications, including humanitarian aid during disasters or conflicts. Nevertheless, extracting dwellings remains difficult due to their differing sizes, shapes, and locations. In this study, we use U-Net and residual U-Net to deal with dwelling classification in a refugee camp in northern Cameroon, Africa. Specifically, two semantic segmentation networks are adapted and applied. A limited number of randomly divided sample patches is used to train and test the networks based on a single image of the WorldView-3 satellite. Our accuracy assessment was conducted for four different dwelling categories, using metrics such as Precision, Recall, F1, and the Kappa coefficient. As a result, F1 ranges from 81% to over 99% for the U-Net and from approximately 88.1% to 99.5% for the residual U-Net.

(This article belongs to the Special Issue Advanced Application of Artificial Intelligence and Machine Vision in Remote Sensing II)

24 pages, 19513 KB

Open Access | Article

Fusion of Remote Sensing, Magnetometric, and Geological Data to Identify Polymetallic Mineral Potential Zones in Chakchak Region, Yazd, Iran

by Ali Akbar Aali, Adel Shirazy, Aref Shirazi, Amin Beiranvand Pour, Ardeshir Hezarkhani, Abbas Maghsoudi, Mazlan Hashim and Shayan Khakmardan

Remote Sens. 2022, 14(23), 6018; https://doi.org/10.3390/rs14236018 - 28 Nov 2022

Cited by 34 | Viewed by 4753

Abstract

Exploration geologists are urged to develop new, robust, and low-cost approaches to identify high potential zones related to underground/unexplored mineral deposits because of increased depletion of ore deposits and high consumption by basic metal production industries. Fusing remote sensing, geophysical, and geological data has great capability to provide a complete range of prerequisite data to accomplish this purpose. This investigation fuses remote sensing data, such as Sentinel-2 and Landsat 7, aerial magnetic geophysical data, and geological data for identifying polymetallic mineralization potential zones in the Chakchak region, Yazd province, Iran. Hydrothermal alteration mineral zones and surface and deep intrusive masses, hidden faults and lineaments, and lithological units were detected using remote sensing, aerial magnetic, and geological data, respectively. The exploratory/information layers were fused using fuzzy logic modeling and the multi-class index overlap method. Subsequently, mineral potential maps were generated for the study area. Some high potential zones of polymetallic mineralization were identified and verified through a detailed field campaign and drilling programs in the Chakchak region. In conclusion, the fusion of remote sensing, geophysical, and geological data using fuzzy logic modeling and the multi-class index overlap method is a robust, reliable, and low-cost approach for mining companies to explore the frontier areas with identical geologic conditions that are alleged to indicate polymetallic mineralization potential.

(This article belongs to the Special Issue Advanced Application of Artificial Intelligence and Machine Vision in Remote Sensing II)
