
3D Information Recovery and 2D Image Processing for Remotely Sensed Optical Images II

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: closed (31 October 2024) | Viewed by 14183

Special Issue Editors

School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
Interests: image processing; texture mapping; photogrammetry
Special Issues, Collections and Topics in MDPI journals

Guest Editor
School of Control Science and Engineering, Shandong University, Jinan 250061, China
Interests: computer vision; machine learning; robotics

Guest Editor
School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
Interests: computer vision; SLAM; artificial intelligence; LiDAR point clouds processing

Guest Editor
Dept. Information Engineering and Mathematics, University of Siena, Via Roma, 56, I-53100 Siena, Italy
Interests: remote sensing; image fusion

Guest Editor
Dept. Information Engineering and Mathematics, University of Siena, Via Roma, 56, I-53100 Siena, Italy
Interests: remote sensing; image/video processing; computer vision; machine learning/artificial intelligence; interferometry; LiDAR/laser 3D reconstruction

Special Issue Information

Dear Colleagues,

Due to the overwhelming support and interest in the previous Special Issue (SI), we are introducing a second edition on “3D Information Recovery and 2D Image Processing for Remotely Sensed Optical Images”. We would like to thank all the authors and co-authors who contributed to the success of the first edition of this SI.

In the photogrammetry and remote sensing fields, an important and longstanding task is the recovery of the 3D information of scenes, followed by the generation of visually appealing digital orthophoto maps (DOMs) with rich semantic information. Remotely sensed optical images are one of the most widely used data sources, and the key technologies for this task are 3D information recovery and 2D image processing. Recently, with the development of deep learning, many deep-learning-based methods have been proposed in the computer vision field to recover the 3D information of scenes, enhance image quality, and acquire semantic information. However, almost all of these methods focus on photos taken by smartphones or SLR cameras; few works have explored these recent advances in remote sensing. Thus, we aim to collect recent research related to “3D Information Recovery and 2D Image Processing for Remotely Sensed Optical Images”. We invite you to participate in this Special Issue by submitting articles. Topics of particular interest include, but are not limited to, the following:

  • Feature matching and outlier detection for remote sensing image matching;
  • Pose estimation from 2D remote sensing images;
  • Dense matching of images acquired by remote sensing for 3D reconstruction;
  • Depth estimation of images acquired by remote sensing;
  • Texture mapping for 3D models;
  • Digital elevation model generation from remotely sensed images;
  • Digital orthophoto map generation;
  • Image stitching and color correction for remotely sensed images;
  • Enhancement, denoising, and super-resolution of images acquired by remote sensing;
  • Semantic segmentation and object detection for images obtained by remote sensing.

Dr. Li Li
Prof. Dr. Wei Zhang
Prof. Dr. Jian Yao
Prof. Dr. Andrea Garzelli
Dr. Claudia Zoppetti
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • remote sensing image processing
  • feature matching
  • dense matching
  • pose estimation
  • 3D reconstruction
  • semantic segmentation
  • object detection
  • image stitching
  • image enhancement
  • image denoising
  • image super-resolution
  • digital elevation model (DEM)
  • digital orthophoto map (DOM)

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (10)


Research

16 pages, 9232 KiB  
Article
DSM Reconstruction from Uncalibrated Multi-View Satellite Stereo Images by RPC Estimation and Integration
by Dong-Uk Seo and Soon-Yong Park
Remote Sens. 2024, 16(20), 3863; https://doi.org/10.3390/rs16203863 - 17 Oct 2024
Viewed by 537
Abstract
In this paper, we propose a 3D Digital Surface Model (DSM) reconstruction method for uncalibrated Multi-view Satellite Stereo (MVSS) images, where Rational Polynomial Coefficient (RPC) sensor parameters are not available. While recent investigations have introduced several techniques to reconstruct high-precision and high-density DSMs from MVSS images, they inherently depend on geo-corrected RPC sensor parameters. However, RPC parameters from satellite sensors are often erroneous due to inaccurate sensor data. In addition, with the increasing availability of data on the internet, uncalibrated satellite images without RPC parameters can be easily obtained. This study proposes a novel method to reconstruct a 3D DSM from uncalibrated MVSS images by estimating and integrating RPC parameters. To do this, we first employ a structure-from-motion (SfM) and 3D homography-based geo-referencing method to reconstruct an initial DSM. Second, we sample 3D points from the initial DSM as references and reproject them to the 2D image space to determine 3D–2D correspondences. Using these correspondences, we directly calculate all RPC parameters. To overcome memory limitations when processing large satellite images, we also propose an RPC integration method: the image space is partitioned into multiple tiles, RPC estimation is performed independently in each tile, and all tiles’ RPCs are then integrated into a final RPC that represents the geometry of the whole image space. Finally, the integrated RPC is used to run a true MVSS pipeline to obtain the 3D DSM. The experimental results show that the proposed method achieves a 1.455 m Mean Absolute Error (MAE) in height map reconstruction on multi-view satellite benchmark datasets. We also show that the proposed method can reconstruct a geo-referenced 3D DSM from uncalibrated, freely available Google Earth imagery. Full article
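The RPC estimation step described above solves for sensor coefficients from 3D–2D correspondences. As a rough sketch of the idea (not the authors' code: a real RPC uses third-order polynomials with 78 coefficients per image axis, while this toy fits a first-order rational model, and all function names here are illustrative), multiplying through by the denominator reduces the rational fit to ordinary least squares:

```python
import numpy as np

def fit_rational_row(pts3d, rows):
    """Fit row = (a . [1, X, Y, Z]) / (1 + b . [X, Y, Z]) to 3D->image-row
    correspondences by linear least squares. Multiplying both sides by the
    denominator gives a system linear in the coefficients a and b:
    a0 + a1 X + a2 Y + a3 Z - row*(b1 X + b2 Y + b3 Z) = row."""
    X, Y, Z = pts3d.T
    r = rows
    A = np.column_stack([np.ones_like(X), X, Y, Z, -r * X, -r * Y, -r * Z])
    coef, *_ = np.linalg.lstsq(A, r, rcond=None)
    return coef[:4], coef[4:]           # numerator a, denominator b

def project_row(pts3d, a, b):
    """Evaluate the fitted rational model at 3D points."""
    X, Y, Z = pts3d.T
    num = a[0] + a[1] * X + a[2] * Y + a[3] * Z
    den = 1.0 + b[0] * X + b[1] * Y + b[2] * Z
    return num / den
```

The same linearization is commonly used to initialize full third-order RPCs before iterative refinement.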

18 pages, 6689 KiB  
Article
Towards a Novel Generative Adversarial Network-Based Framework for Remote Sensing Image Demosaicking
by Yuxuan Guo, Xuemin Zhang and Guang Jin
Remote Sens. 2024, 16(13), 2283; https://doi.org/10.3390/rs16132283 - 22 Jun 2024
Cited by 1 | Viewed by 651
Abstract
During satellite remote sensing imaging, the use of Bayer mode sensors holds significant importance in saving airborne computing resources and reducing the burden of satellite transmission systems. The demosaicing techniques play a key role in this process. The integration of Generative Adversarial Networks (GANs) has garnered significant interest in the realm of image demosaicking, owing to their ability to generate intricate details. However, when demosaicing mosaic images in remote sensing techniques, GANs, although capable of generating rich details, often introduce unpleasant artifacts while generating content. To address this challenge and differentiate between undesirable artifacts and realistic details, we have devised a novel framework based on a Progressive Discrimination Strategy within a Generative Adversarial Network architecture for image demosaicking. Our approach incorporates an artifact-weighted Location Map refinement technique to guide the optimization process towards generating authentic details in a stable and precise manner. Furthermore, our framework integrates a global attention mechanism to boost the interaction of spatial-channel information across different dimensions, thereby enhancing the performance of the generator network. Moreover, we conduct a comparative analysis of various prevalent attention mechanisms in the context of remote sensing image demosaicking. The experimental findings unequivocally demonstrate that our proposed methodology not only achieves superior reconstruction accuracy on the dataset but also enhances the perceptual quality of the generated images. By effectively mitigating artifacts and emphasizing the generation of true details, our approach represents a significant advancement in the field of remote sensing image demosaicking, promising enhanced visual fidelity and realism in reconstructed images. Full article

23 pages, 8731 KiB  
Article
Development of a High-Precision Lidar System and Improvement of Key Steps for Railway Obstacle Detection Algorithm
by Zongliang Nan, Guoan Zhu, Xu Zhang, Xuechun Lin and Yingying Yang
Remote Sens. 2024, 16(10), 1761; https://doi.org/10.3390/rs16101761 - 16 May 2024
Cited by 3 | Viewed by 3413
Abstract
In response to the growing demand for railway obstacle monitoring, lidar technology has emerged as an up-and-coming solution. In this study, we developed a mechanical 3D lidar system and meticulously calibrated the point cloud transformation to monitor specific areas precisely. Based on this foundation, we have devised a novel set of algorithms for obstacle detection within point clouds. These algorithms encompass three key steps: (a) the segmentation of ground point clouds and extraction of track point clouds using our RS-Lo-RANSAC (region select Lo-RANSAC) algorithm; (b) the registration of the BP (background point cloud) and FP (foreground point cloud) via an improved Robust ICP algorithm; and (c) obstacle recognition based on the VFOR (voxel-based feature obstacle recognition) algorithm from the fused point clouds. This set of algorithms has demonstrated robustness and operational efficiency in our experiments on a dataset obtained from an experimental field. Notably, it enables monitoring obstacles with dimensions of 15 cm × 15 cm × 15 cm. Overall, our study showcases the immense potential of lidar technology in railway obstacle monitoring, presenting a promising solution to enhance safety in this field. Full article
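Step (a) above builds on RANSAC plane fitting for ground segmentation. A minimal vanilla RANSAC plane segmenter can be sketched as follows; the paper's RS-Lo-RANSAC additionally restricts sampling to a selected region and locally optimizes the model, both omitted here:

```python
import numpy as np

def ransac_ground_plane(points, n_iters=200, dist_thresh=0.05, seed=0):
    """Repeatedly fit a plane to 3 random points from an (N, 3) cloud and keep
    the hypothesis with the most inliers within dist_thresh (metres).
    Returns a boolean inlier mask marking the estimated ground points."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-9:                   # degenerate (collinear) sample
            continue
        normal /= norm
        dist = np.abs((points - p0) @ normal)   # point-to-plane distances
        inliers = dist < dist_thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers
```

Points outside the mask (above the fitted ground) are the candidates passed on to obstacle recognition.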

19 pages, 12560 KiB  
Article
A Multiscale Attention Segment Network-Based Semantic Segmentation Model for Landslide Remote Sensing Images
by Nan Zhou, Jin Hong, Wenyu Cui, Shichao Wu and Ziheng Zhang
Remote Sens. 2024, 16(10), 1712; https://doi.org/10.3390/rs16101712 - 11 May 2024
Cited by 5 | Viewed by 1490
Abstract
Landslide disasters have garnered significant attention due to their extensive devastating impact, leading to a growing emphasis on the prompt and precise identification and detection of landslides as a prominent area of research. Previous research has primarily relied on human–computer interactions and visual interpretation from remote sensing to identify landslides. However, these methods are time-consuming, labor-intensive, subjective, and have a low level of accuracy in extracting data. An essential task in deep learning, semantic segmentation, has been crucial to automated remote sensing image recognition tasks because of its end-to-end pixel-level classification capability. In this study, to mitigate the disadvantages of existing landslide detection methods, we propose a multiscale attention segment network (MsASNet) that acquires different scales of remote sensing image features, designs an encoder–decoder structure to strengthen the landslide boundary, and combines the channel attention mechanism to strengthen the feature extraction capability. The MsASNet model exhibited an average accuracy of 95.13% on the test set from Bijie’s landslide dataset, a mean accuracy of 91.45% on the test set from Chongqing’s landslide dataset, and a mean accuracy of 90.17% on the test set from Tianshui’s landslide dataset, signifying its ability to extract landslide information efficiently and accurately in real time. Our proposed model may be used in efforts toward the prevention and control of geological disasters. Full article
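The channel attention mechanism mentioned above reweights feature channels by learned importance. A generic squeeze-and-excitation sketch is shown below; MsASNet's exact attention module may differ in detail, and the bottleneck weights `w1`/`w2` are assumed to be given (in a real network they are learned):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation style channel attention over a (C, H, W) feature
    map: global-average-pool to a per-channel descriptor, pass it through a
    two-layer bottleneck (w1: (C/r, C), w2: (C, C/r)), and rescale each
    channel by the resulting gate in (0, 1)."""
    squeeze = feat.mean(axis=(1, 2))                      # (C,) global context
    excite = sigmoid(w2 @ np.maximum(w1 @ squeeze, 0.0))  # (C,) channel gates
    return feat * excite[:, None, None]                   # rescaled features
```

Channels that the bottleneck deems informative keep most of their magnitude, while uninformative ones are suppressed, which is what "strengthening feature extraction" refers to here.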

27 pages, 14613 KiB  
Article
A UAV-Based Single-Lens Stereoscopic Photography Method for Phenotyping the Architecture Traits of Orchard Trees
by Wenli Zhang, Xinyu Peng, Tingting Bai, Haozhou Wang, Daisuke Takata and Wei Guo
Remote Sens. 2024, 16(9), 1570; https://doi.org/10.3390/rs16091570 - 28 Apr 2024
Cited by 1 | Viewed by 1145
Abstract
This article addresses the challenges of measuring the 3D architecture traits, such as height and volume, of fruit tree canopies, constituting information that is essential for assessing tree growth and informing orchard management. The traditional methods are time-consuming, prompting the need for efficient alternatives. Recent advancements in unmanned aerial vehicle (UAV) technology, particularly using Light Detection and Ranging (LiDAR) and RGB cameras, have emerged as promising solutions. LiDAR offers precise 3D data but is costly and computationally intensive. RGB photogrammetry techniques such as Structure from Motion and Multi-View Stereo (SfM-MVS) can be a cost-effective alternative to LiDAR, but their computational demands remain high. This paper introduces an innovative approach using UAV-based single-lens stereoscopic photography to overcome these limitations. The method utilizes color variations in canopies and a dual-image-input network to generate a detailed canopy height map (CHM). Additionally, a block structure similarity method is presented to enhance height estimation accuracy in single-lens UAV photography. As a result, the average rates of growth in canopy height (CH), canopy volume (CV), canopy width (CW), and canopy projected area (CPA) were 3.296%, 9.067%, 2.772%, and 5.541%, respectively. The r2 values of CH, CV, CW, and CPA were 0.9039, 0.9081, 0.9228, and 0.9303, respectively. In addition, compared to the commonly used SfM-MVS approach, the proposed method reduces the time cost of canopy reconstruction by 95.2% and the number of images needed for canopy reconstruction by 88.2%. This approach allows growers and researchers to use UAV-based methods in actual orchard environments without incurring high computation costs. Full article
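Once a canopy height map exists, traits such as CH, CPA, and CV fall out of simple raster statistics. The definitions below are illustrative only (the paper's exact trait formulas may differ), with the height threshold and cell size as assumed parameters:

```python
import numpy as np

def canopy_traits(chm, cell_size, ground_thresh=0.2):
    """Derive canopy height (CH, m), canopy projected area (CPA, m^2) and
    canopy volume (CV, m^3) from a 2D canopy height map in metres.
    cell_size is the ground sampling distance; pixels at or below
    ground_thresh are treated as non-canopy."""
    canopy = chm > ground_thresh
    ch = float(chm[canopy].max()) if canopy.any() else 0.0
    cpa = float(canopy.sum()) * cell_size ** 2       # count of canopy cells * area
    cv = float(chm[canopy].sum()) * cell_size ** 2   # column-sum (prism) volume
    return ch, cpa, cv
```

This also illustrates why CHM accuracy drives the reported trait accuracies: every trait is a direct functional of the height raster.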

20 pages, 4604 KiB  
Article
Full-Process Adaptive Encoding and Decoding Framework for Remote Sensing Images Based on Compression Sensing
by Huiling Hu, Chunyu Liu, Shuai Liu, Shipeng Ying, Chen Wang and Yi Ding
Remote Sens. 2024, 16(9), 1529; https://doi.org/10.3390/rs16091529 - 26 Apr 2024
Viewed by 870
Abstract
Faced with the incompatibility between traditional information acquisition modes and spaceborne earth observation tasks, and starting from the general mathematical model of compressed sensing, a theoretical model of block compressed sensing was established, and a full-process adaptive encoding and decoding compressed sensing framework for remote sensing images was proposed, comprising five parts: mode selection, feature factor extraction, adaptive shape segmentation, adaptive sampling rate allocation, and image reconstruction. Unlike previous semi-adaptive or locally adaptive methods, the advantages of the proposed adaptive encoding and decoding method are mainly reflected in four aspects: (1) it selects encoding modes based on image content, making maximal use of the richness of the image to choose appropriate sampling methods; (2) it utilizes image texture details for adaptive segmentation, effectively separating complex and smooth regions; (3) it detects the sparsity of encoding blocks and adaptively allocates sampling rates to fully exploit the compressibility of images; (4) the reconstruction matrix is adaptively selected based on the size of the encoding block to alleviate block artifacts caused by the non-stationary characteristics of the image. Experimental results show that the proposed method is stable for remote sensing images with complex edge textures, with the peak signal-to-noise ratio and structural similarity remaining above 35 dB and 0.8, respectively. Moreover, for ocean images with relatively simple content, at a sampling rate of 0.26 the peak signal-to-noise ratio reaches 50.8 dB and the structural similarity is 0.99. In addition, the recovered images have the smallest BRISQUE value, with better clarity and less distortion. Subjectively, the reconstructed images have clear edge details and good reconstruction quality, while the block effect is effectively suppressed. The framework designed in this paper outperforms similar algorithms in both subjective visual and objective evaluation indexes, which is of great significance for alleviating the incompatibility between traditional information acquisition methods and satellite-borne earth observation missions. Full article
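The block compressed sensing model underlying this framework measures each image block x as y = Φx with a short random matrix Φ. The sketch below uses a uniform sampling rate for every block; the framework described above instead allocates the rate per block from its estimated sparsity, and the Gaussian Φ here is just one common choice:

```python
import numpy as np

def block_cs_sample(image, block=8, rate=0.26, seed=0):
    """Block compressed sensing of a grayscale image whose sides are multiples
    of `block`: each non-overlapping block, flattened to length block^2, is
    measured as y = phi @ x with a random Gaussian matrix phi of shape
    (m, block^2), where m = ceil(rate * block^2)."""
    n = block * block
    m = int(np.ceil(rate * n))
    rng = np.random.default_rng(seed)
    phi = rng.normal(size=(m, n)) / np.sqrt(m)   # measurement matrix
    h, w = image.shape
    measurements = []
    for i in range(0, h, block):
        for j in range(0, w, block):
            x = image[i:i + block, j:j + block].reshape(-1)
            measurements.append(phi @ x)
    return phi, np.array(measurements)
```

The decoder's job, which the paper's reconstruction stage addresses, is to invert this underdetermined map per block using a sparsity prior.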

18 pages, 21669 KiB  
Article
Shadow-Aware Point-Based Neural Radiance Fields for High-Resolution Remote Sensing Novel View Synthesis
by Li Li, Yongsheng Zhang, Ziquan Wang, Zhenchao Zhang, Zhipeng Jiang, Ying Yu, Lei Li and Lei Zhang
Remote Sens. 2024, 16(8), 1341; https://doi.org/10.3390/rs16081341 - 11 Apr 2024
Viewed by 1014
Abstract
Novel view synthesis using neural radiance fields (NeRFs) for remote sensing images is important for various applications. Traditional methods often use implicit representations for modeling, which have slow rendering speeds and cannot directly obtain the structure of the 3D scene. Some studies have introduced explicit representations, such as point clouds and voxels, but this kind of method often produces holes when processing large-scale scenes from remote sensing images. In addition, NeRFs with explicit 3D expression are more susceptible to transient phenomena (shadows and dynamic objects) and even plane holes. In order to address these issues, we propose an improved method for synthesizing new views of remote sensing images based on Point-NeRF. Our main idea focuses on two aspects: filling in the spatial structure and reconstructing ray-marching rendering using shadow information. First, we introduce hole detection, conducting inverse projection to acquire candidate points that are adjusted during training to fill the holes. We also design incremental weights to reduce the probability of pruning the plane points. We introduce a geometrically consistent shadow model based on a point cloud to divide the radiance into albedo and irradiance, allowing the model to predict the albedo of each point, rather than directly predicting the radiance. Intuitively, our proposed method uses a sparse point cloud generated with traditional methods for initialization and then builds the dense radiance field. We evaluate our method on the LEVIR_NVS data set, demonstrating its superior performance compared to state-of-the-art methods. Overall, our work provides a promising approach for synthesizing new viewpoints of remote sensing images. Full article
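The albedo/irradiance split described above factors each point's radiance into a view-independent color and a lighting term gated by shadowing. The toy Lambertian version below conveys the idea only; the paper's geometrically consistent shadow model is more elaborate, and the ambient constant here is an illustrative assumption:

```python
import numpy as np

def shaded_radiance(albedo, sun_dir, normals, shadow, ambient=0.15):
    """Per-point radiance = albedo * irradiance, where irradiance combines a
    Lambertian sun term (clamped n . l), gated by a shadow scalar in [0, 1],
    with a constant ambient term. albedo: (N, 3), normals: (N, 3) unit
    vectors, sun_dir: (3,) unit vector, shadow: (N,)."""
    ndotl = np.clip(normals @ sun_dir, 0.0, None)         # Lambert term
    irradiance = ambient + (1.0 - ambient) * shadow * ndotl
    return albedo * irradiance[:, None]                   # (N, 3) radiance
```

Predicting albedo rather than radiance, as the paper does, makes the learned quantity invariant to where shadows fall, which is exactly why transient shading stops corrupting the geometry.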

22 pages, 7134 KiB  
Article
End-to-End Edge-Guided Multi-Scale Matching Network for Optical Satellite Stereo Image Pairs
by Yixin Luo, Hao Wang and Xiaolei Lv
Remote Sens. 2024, 16(5), 882; https://doi.org/10.3390/rs16050882 - 2 Mar 2024
Cited by 1 | Viewed by 1193
Abstract
Acquiring disparity maps by dense stereo matching is one of the most important methods for producing digital surface models. However, the characteristics of optical satellite imagery, including significant occlusions and long baselines, increase the challenges of dense matching. In this study, we propose an end-to-end edge-guided multi-scale matching network (EGMS-Net) tailored for optical satellite stereo image pairs. Using small convolutional filters and residual blocks, the EGMS-Net captures rich high-frequency signals during the initial feature extraction phase. Subsequently, pyramid features are derived through efficient down-sampling and consolidated into cost volumes. To regularize these cost volumes, we design a top–down multi-scale fusion network that integrates an attention mechanism. Finally, we innovate the use of trainable guided filter layers in disparity refinement to improve edge detail recovery. The network is trained and evaluated using the Urban Semantic 3D and WHU-Stereo datasets, with subsequent analysis of the disparity maps. The results show that the EGMS-Net provides superior results, achieving endpoint errors of 1.515 and 2.459 pixels, respectively. In challenging scenarios, particularly in regions with textureless surfaces and dense buildings, our network consistently delivers satisfactory matching performance. In addition, EGMS-Net reduces training time and increases network efficiency, improving overall results. Full article
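The cost volumes consolidated from pyramid features are the core data structure here. A minimal construction for rectified stereo features is sketched below, with a naive winner-takes-all readout standing in for the paper's attention-based regularization network:

```python
import numpy as np

def build_cost_volume(feat_left, feat_right, max_disp):
    """Absolute-difference cost volume for rectified stereo features of shape
    (C, H, W): cost[d, y, x] compares the left pixel (y, x) with the right
    pixel (y, x - d). Disparities that fall outside the image get a large
    sentinel cost."""
    c, h, w = feat_left.shape
    cost = np.full((max_disp, h, w), 1e6)
    for d in range(max_disp):
        if d == 0:
            cost[0] = np.abs(feat_left - feat_right).sum(axis=0)
        else:
            diff = np.abs(feat_left[:, :, d:] - feat_right[:, :, :-d])
            cost[d, :, d:] = diff.sum(axis=0)
    return cost

def winner_takes_all(cost):
    """Pick the minimum-cost disparity per pixel (no regularization)."""
    return cost.argmin(axis=0)
```

Occlusions and long baselines show up as pixels where no disparity has a clearly lowest cost, which is what the multi-scale fusion and edge-guided refinement stages are designed to resolve.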

20 pages, 3353 KiB  
Article
ISHS-Net: Single-View 3D Reconstruction by Fusing Features of Image and Shape Hierarchical Structures
by Guoqing Gao, Liang Yang, Quan Zhang, Chongmin Wang, Hua Bao and Changhui Rao
Remote Sens. 2023, 15(23), 5449; https://doi.org/10.3390/rs15235449 - 22 Nov 2023
Viewed by 1571
Abstract
The reconstruction of 3D shapes from a single view has been a longstanding challenge. Previous methods have primarily focused on learning either geometric features that depict overall shape contours but are insufficient for occluded regions, local features that capture details but cannot represent the complete structure, or structural features that encode part relationships but require predefined semantics. However, the fusion of geometric, local, and structural features has been lacking, leading to inaccurate reconstruction of shapes with occlusions or novel compositions. To address this issue, we propose a two-stage approach for achieving 3D shape reconstruction. In the first stage, we encode the hierarchical structure features of the 3D shape using an encoder-decoder network. In the second stage, we enhance the hierarchical structure features by fusing them with global and point features and feed the enhanced features into a signed distance function (SDF) prediction network to obtain rough SDF values. Using the camera pose, we project arbitrary 3D points in space onto different depth feature maps of the CNN and obtain their corresponding positions. Then, we concatenate the features of these corresponding positions together to form local features. These local features are also fed into the SDF prediction network to obtain fine-grained SDF values. By fusing the two sets of SDF values, we improve the accuracy of the model and enable it to reconstruct other object types with higher quality. Comparative experiments demonstrate that the proposed method outperforms state-of-the-art approaches in terms of accuracy. Full article

Other

19 pages, 25201 KiB  
Technical Note
Disparity Refinement for Stereo Matching of High-Resolution Remote Sensing Images Based on GIS Data
by Xuanqi Wang, Liting Jiang, Feng Wang, Hongjian You and Yuming Xiang
Remote Sens. 2024, 16(3), 487; https://doi.org/10.3390/rs16030487 - 26 Jan 2024
Cited by 1 | Viewed by 1500
Abstract
With the emergence of the Smart City concept, the rapid advancement of urban three-dimensional (3D) reconstruction becomes imperative. While current developments in the field of 3D reconstruction have enabled the generation of 3D products such as Digital Surface Models (DSM), challenges persist in accurately reconstructing shadows, handling occlusions, and addressing low-texture areas in very-high-resolution remote sensing images. These challenges often make it difficult to calculate satisfactory disparity maps using existing stereo matching methods, thereby reducing the accuracy of 3D reconstruction. This issue is particularly pronounced in urban scenes, which contain numerous super-high-rise and densely distributed buildings, resulting in large disparity values and occluded regions in stereo image pairs, and in turn a large number of mismatched points in the obtained disparity map. In response to these challenges, this paper proposes a method to refine the disparity in urban scenes based on open-source GIS data. First, we register the GIS data with the epipolar-rectified images, since there are always non-negligible geolocation errors between them. Specifically, buildings of different heights present different offsets when registering GIS data; thus, we perform multi-modal matching for each building and merge the results into a final building mask. Subsequently, a two-layer optimization process is applied to the initial disparity map based on the building mask, encompassing both global and local optimization. Finally, we perform a post-correction on the building facades to obtain the final refined disparity map, which can be employed for high-precision 3D reconstruction. Experimental results on SuperView-1, GaoFen-7, and GeoEye satellite images show that the proposed method is able to correct the occluded and mismatched areas in the initial disparity map generated by both hand-crafted and deep-learning stereo matching methods. The DSM generated from the refined disparity reduces the average height error from 2.2 m to 1.6 m, demonstrating superior performance compared with other disparity refinement methods. Furthermore, the proposed method improves the integrity of the target structure and yields steeper building facades and complete roofs, which are conducive to subsequent 3D model generation. Full article
