Digital Surface Model Generation from Satellite Images Based on Double-Penalty Bundle Adjustment Optimization

Li, Henan; Yin, Junping; Jiao, Liguo

doi:10.3390/app14177777


by Henan Li ^1,2, Junping Yin ^2,3,* and Liguo Jiao ^1,2,*

¹ Academy for Advanced Interdisciplinary Studies, Northeast Normal University, Changchun 130024, China

² Shanghai Zhangjiang Institute of Mathematics, Shanghai 201203, China

³ Institute of Applied Physics and Computational Mathematics, Beijing 100094, China

^* Authors to whom correspondence should be addressed.

Appl. Sci. 2024, 14(17), 7777; https://doi.org/10.3390/app14177777

Submission received: 31 July 2024 / Revised: 28 August 2024 / Accepted: 29 August 2024 / Published: 3 September 2024

(This article belongs to the Section Earth Sciences)


Abstract: Digital Surface Model (DSM) generation from high-resolution optical satellite images is an important research topic in the remote sensing field. In optical satellite imaging systems, the attitude information of the cameras recorded by satellite sensors is often biased, which leads to errors in the Rational Polynomial Camera (RPC) model of satellite imaging. These errors in the RPC model can mislead the DSM generation. To solve this problem, we propose an automatic DSM generation method from satellite images based on the Double-Penalty bundle adjustment (DPBA) optimization algorithm. In the proposed method, two penalty functions representing the camera’s attitude and the spatial 3D points, respectively, are added to the reprojection error model of the traditional bundle adjustment optimization algorithm. Instead of acting on images directly, the penalty functions are used to adjust the reprojection error model and improve the RPC parameters. We evaluate the performance of the proposed method using high-resolution satellite image pairs and multi-date satellite images. We compare the accuracy and completeness of the DSMs generated by the proposed method, the Satellite Stereo Pipeline (S2P) method, and the traditional bundle adjustment (BA) method. Compared to the S2P method, the experimental results on satellite image pairs indicate that the proposed method improves the accuracy and the completeness of the generated DSM by about 1–5 m and 20%–60% in most cases. Compared to the traditional BA method, the proposed method improves the accuracy and completeness of the generated DSM by about 0.01–0.05 m and 1%–3% in most cases. These results support the feasibility and effectiveness of the proposed method.

Keywords: satellite images; DSM generation; bundle adjustment; RPC model

1. Introduction

Earth observation technology has developed rapidly in the past few years, which has led to a large increase in the number of high-resolution satellite images. This large volume of imagery provides powerful data support for Digital Surface Model (DSM) generation. DSM generation from satellite images can provide abundant spatial information and reliable data support for urban planning, resource management, environmental monitoring, disaster prevention, and other fields [1]. Therefore, the development of DSM generation technology for satellite images has important practical significance and application prospects.

According to the input data type, the DSM generation from satellite images can be divided into generation from image pairs and from multi-date images [2,3].

DSM generation from image pairs requires stereo rectification [4], dense matching [5], and triangulation [6] to generate the 3D point cloud. The Satellite Stereo Pipeline (S2P) method [7], proposed by De Franchis et al., is an outstanding representative of DSM generation from satellite image pairs; it corrects the relative pointing error of the image pairs and realizes automatic DSM generation [8,9]. As far as we know, the S2P method is fast, but the accuracy of the generated DSM leaves room for improvement. Later, Qin proposed the RPC (Rational Polynomial Coefficient) Stereo Processor (RSP) [10] for DSM generation, and in follow-up research Qin et al. combined ground video frames with satellite images for cross-view registration to enrich the details of the generated DSM [11]. The latter method can reduce holes in the generated DSM, but its pipeline is complicated. Recently, with the emergence of deep learning, some researchers have used deep learning algorithms for the dense matching step of the satellite image DSM generation pipeline and obtained good results [12,13,14,15]. However, deep-learning-based methods are hard to interpret and generalize weakly; moreover, they require a large amount of data and computing resources to train the model, so they are difficult to use in some important tasks (such as emergency cartography). In recent years, researchers have also proposed new multi-view pipelines for satellite imagery. For example, Michel et al. proposed a stereo pipeline that deals with different landscapes and image defects [16]. Patil et al. generated a more accurate satellite stereo parallax map based on image registration and correction [17]. Other researchers have proposed DSM generation methods based on small satellite data [18] and reconstruction methods based on images collected by different satellites [19].

As roughly suggested by [20], the DSM generation from multi-date satellite images can be divided into two types: the multi-view stereo method based on dense stereo matching, and the true multi-view method that simultaneously triangulates all images.

The multi-view stereo method needs an image pair selection strategy, which selects images with high correspondence from the multi-date satellite images to form stereo image pairs. The selected image pairs are then used to generate multiple 3D point clouds. Finally, the multi-view stereo method aligns and fuses the 3D point clouds and generates the final DSM. In this sense, the multi-view stereo method can be regarded as an extension of DSM generation from satellite image pairs. For multi-date satellite images, the researchers of S2P proposed an image pair selection strategy based on the intersection angle and acquisition date to realize automatic DSM generation [20]. Qin analyzed metadata such as the intersection angle, sun azimuth angle, and sun elevation angle with a Support Vector Machine (SVM) model [21] and gave an image pair selection strategy according to the SVM analysis results. This method is very comprehensive but requires a large amount of training data. Later, Qin et al. proposed a scalable method that incorporates the matching uncertainty to adaptively guide the fusion process [22]. This method is effective for both flat areas and street details.

The true multi-view method requires the simultaneous processing of multi-view satellite images. Most research in this area focuses on proposing new models or frameworks [23,24]. Recently, some researchers have analyzed comprehensively whether computer vision methods can be applied to satellite imagery. The Structure from Motion (SFM) method is an important 3D reconstruction method in the computer vision community. Visual 3D reconstruction usually uses simple affine camera models, whereas satellite images captured by pushbroom cameras usually use abstract rational function models. Therefore, SFM-based visual 3D reconstruction has been difficult to apply to satellite images. It was not until 2019 that Zhang et al. analyzed the similarities and differences between the multi-view stereo method and the visual 3D reconstruction method [25] and applied the SFM framework to satellite images by using the affine camera model to approximate the Rational Polynomial Camera (RPC) model in a very small area. Following this study, Bullinger et al. improved the application of the SFM method to satellite images in terms of skew correction [26]. Although the visual SFM method has been well tested on satellite images, the accuracy of the obtained 3D point cloud is not better than that of the traditional multi-view stereo method. In addition, there is a reconstruction pipeline that combines the multi-view stereo method with the visual SFM method [27]. However, this pipeline is time-consuming and requires additional ground control points.

The RPC file serves as an important input for satellite image DSM generation, and the RPC model parameters should ideally be very accurate. However, because the satellite's sensors cannot record its attitude accurately, the RPC parameters are often biased [28]. The traditional method of RPC model refinement is to modify the RPC parameters according to ground control points and their corresponding 2D image points [29]. This method requires many 2D-3D tie points to get accurate results. However, it is difficult to obtain many ground control points in some areas, so this method has geographical limitations. Nowadays, the bundle adjustment method based on 2D-3D tie points has become the mainstream approach to DSM generation based on RPC model refinement [30,31,32]. It is worth mentioning that this method still has problems, such as slow convergence.

In order to reduce the error of the RPC model parameters and solve the problem of slow convergence of the traditional bundle adjustment (BA) methods, this paper proposes a DSM generation method from satellite images based on the Double-Penalty bundle adjustment (DPBA) algorithm. The main contributions of this paper are as follows:

A new automatic DSM generation method for satellite images of building areas is proposed, which is suitable not only for satellite image pairs but also for multi-date satellite images; moreover, the proposed method can generate DSMs with high accuracy in the absence of ground control points.
An RPC parameter refinement method based on the DPBA optimization algorithm is proposed. The penalty functions are constructed according to the physical significance of the optimization variables. We embed the bundle adjustment optimization model in a two-layer ridge regression model and apply it to RPC model refinement, which corrects the parameter error of the RPC model well. The refinement method can be used in combination with satellite image rectification methods (such as the image coordinate compensation model and the terrain coordinate compensation model), so it can be used in an arbitrary DSM generation pipeline for satellite images.

In our previous study, we proposed a GAN-based satellite image enhancement method for DSM generation [33]. Since we propose an RPC refinement method in this article, the method of [33] can be used in combination with the method proposed in this article.

The rest of this paper is organized as follows. In Section 2, we provide a brief introduction to the key model and technique in the proposed method. In Section 3, we elaborate on the process and details of the proposed method. In Section 4, the performance of the proposed method is compared with other DSM generation methods, and the experimental results are analyzed comprehensively. Conclusions are drawn in Section 5.

2. RPC Model and Bundle Adjustment

As a projection model commonly used in commercial satellite imagery, the RPC model is very important for DSM generation from satellite images. The proposed DSM generation method refines the parameters of the RPC model based on an improved bundle adjustment optimization algorithm, so this section will briefly introduce the principles of the RPC model and the bundle adjustment optimization algorithm.

2.1. RPC Model

With the commercialization of high-resolution satellite imagery, satellite companies have gradually replaced physical sensor models with RPC models in recent years for reasons of technical secrecy. The RPC model can directly map the relationship between image points and 3D points (including projection and localization). The projection in the RPC model can be represented as:

\begin{matrix} (x, y) = P (X, Y, Z) = (\frac{a (X, Y, Z)}{b (X, Y, Z)}, \frac{c (X, Y, Z)}{d (X, Y, Z)}), \end{matrix}

(1)

where $(x, y)$ and $(X, Y, Z)$ represent the coordinates of the image point and the 3D point, respectively, and $P (\cdot)$ denotes the satellite camera projection. The coordinates $(x, y)$ and $(X, Y, Z)$ here are all normalized to the range [−1, 1]. There are usually 10 normalization parameters in the RPC model. In addition, $a (\cdot), b (\cdot), c (\cdot), d (\cdot)$ represent the polynomials in the RPC model, which have the following form:

\begin{aligned} a (X, Y, Z) = {} & a_{0} + a_{1} Y + a_{2} X + a_{3} Z + a_{4} Y X + a_{5} Y Z + a_{6} X Z + a_{7} Y^{2} + a_{8} X^{2} \\ & + a_{9} Z^{2} + a_{10} X Y Z + a_{11} Y^{3} + a_{12} Y X^{2} + a_{13} Y Z^{2} + a_{14} Y^{2} X \\ & + a_{15} X^{3} + a_{16} X Z^{2} + a_{17} Y^{2} Z + a_{18} X^{2} Z + a_{19} Z^{3} . \end{aligned}

(2)

The forms of $b (\cdot), c (\cdot), d (\cdot)$ are the same as that of $a (\cdot)$, which means the RPC model has 80 rational function coefficients.
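To make Equations (1) and (2) concrete, the following is a minimal Python sketch of the RPC projection on normalized coordinates. The function names and the toy coefficient vectors are assumptions chosen for illustration; real coefficients come from the vendor's RPC file, and the normalization step is omitted.

```python
import numpy as np

def rpc_monomials(X, Y, Z):
    """Return the 20 monomials of Equation (2), ordered to match a_0 ... a_19."""
    return np.array([
        1.0, Y, X, Z, Y * X, Y * Z, X * Z, Y**2, X**2, Z**2,
        X * Y * Z, Y**3, Y * X**2, Y * Z**2, Y**2 * X,
        X**3, X * Z**2, Y**2 * Z, X**2 * Z, Z**3,
    ])

def rpc_project(point, a, b, c, d):
    """Equation (1): map a normalized 3D point to normalized image coordinates."""
    m = rpc_monomials(*point)
    return np.array([np.dot(a, m) / np.dot(b, m), np.dot(c, m) / np.dot(d, m)])

# Hypothetical coefficients (not from a real RPC file):
a = np.zeros(20); a[2] = 1.0   # a(X, Y, Z) = X
b = np.zeros(20); b[0] = 1.0   # b(X, Y, Z) = 1
c = np.zeros(20); c[1] = 1.0   # c(X, Y, Z) = Y
d = np.zeros(20); d[0] = 1.0   # d(X, Y, Z) = 1
xy = rpc_project((0.25, -0.5, 0.1), a, b, c, d)
```

With these toy coefficients the projection simply returns the first two coordinates of the point, which is a quick way to check the monomial ordering against Equation (2).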

The RPC parameters are provided along with satellite images by image vendors. As an alternative to the physical sensor model, the generic RPC model describes satellite imaging in the fractional form of polynomials, which makes it independent of the photographic platform and sensors. In other words, the generic RPC model is a purely mathematical description of the mapping between image points and 3D points. Nevertheless, the RPC model has limitations, such as the fact that the RPC parameters are physically uninterpretable. Note also that each satellite image has its own RPC parameters, and these parameters have certain inaccuracies, which come from errors in the sensors that record the attitude of the satellite. These inaccuracies can shift an image point by tens of pixels [10].

2.2. Bundle Adjustment Optimization

Bundle adjustment, a computer vision technique used to jointly optimize camera parameters and 3D points, is essentially a nonlinear least-squares optimization method whose objective function is the sum of the squares of the reprojection errors (w.r.t. the $l_{2}$-norm) of all feature points in all images. Assume that there are $N$ 3D points ${\{(X_{n}, Y_{n}, Z_{n})\}}_{n = 1,2, \dots N}$ and $M$ cameras, so that there are $M \times N$ 2D observations ${\{(x_{m n}, y_{m n})\}}_{m = 1,2, \dots M; n = 1,2, \dots N}$. In the experiments, $N$ equals the number of matching point pairs or matching point sets. The objective function of the bundle adjustment can be expressed as follows:

\begin{matrix} \min_{(X, P)} \frac{1}{2} \sum_{n = 1}^{N} \sum_{m = 1}^{M} {‖x_{m n} - P_{m} (X_{n})‖}_{2}^{2}, \end{matrix}

(3)

where $x_{m n} = (x_{m n}, y_{m n})$ represents the 2D observations in the satellite images, and $X_{n} = (X_{n}, Y_{n}, Z_{n})$ represents the 3D points. In Equation (3), $P_{m} (\cdot)$ denotes the projection functions of the satellite cameras, which take the forms of Equations (1) and (2). Here $P_{m} (X_{n})$ represents the reprojection of the $n$-th 3D point by the $m$-th camera. The reprojection error is denoted by $x_{m n} - P_{m} (X_{n})$. Bundle adjustment solves for the optimization variables with the smallest reprojection error in the sense of Equation (3).
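As an illustration of Equation (3), the sketch below evaluates the bundle adjustment objective for a given set of points and cameras. The array layout and the toy projection functions are assumptions made for this example; in the paper the projections are RPC models.

```python
import numpy as np

def ba_objective(observations, points, projections):
    """Equation (3): half the sum of squared reprojection errors.

    observations: array of shape (M, N, 2), the 2D observations x_mn
    points:       array of shape (N, 3), the 3D points X_n
    projections:  list of M callables P_m mapping a (3,) point to a (2,) point
    """
    total = 0.0
    for m, project in enumerate(projections):
        for n, X in enumerate(points):
            residual = observations[m, n] - project(X)
            total += residual @ residual
    return 0.5 * total

# Toy cameras (hypothetical, not RPC): each one drops a coordinate.
projections = [lambda X: X[:2], lambda X: X[1:]]
points = np.array([[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]])
observations = np.array([[p(X) for X in points] for p in projections])
observations[0, 0] += [0.3, 0.4]   # perturb one observation by an offset of length 0.5
value = ba_objective(observations, points, projections)
```

Only the perturbed observation contributes, so the objective is half of 0.5 squared, that is 0.125. An optimizer would adjust the points and camera parameters to drive this value down.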

As pointed out in [34], the gradient descent method, the Gauss-Newton method, and the Levenberg-Marquardt algorithm are commonly used to solve bundle adjustment optimization problems. Among the three methods, the gradient descent method is greedy and tends to require many iterations; the approximate Hessian matrix in the Gauss-Newton method may be degenerate, which can prevent the algorithm from converging; the Levenberg-Marquardt algorithm is more robust than the Gauss-Newton method, but it sometimes converges very slowly. In this paper, the proposed method adopts an improved Levenberg-Marquardt algorithm based on trust regions, which is suitable for solving large-scale optimization problems.

3. The Proposed Satellite Images DSM Generation Method

In this paper, we propose a DSM generation method from satellite images based on the DPBA optimization algorithm, which is not only suitable for satellite image pairs but also for multi-date satellite images. The block diagram of the proposed method is shown in Figure 1.

The input of the proposed method consists of satellite images and corresponding RPC files. In Figure 1, different types of images have different processing methods. For the satellite image pair, the proposed method first directly matches the features extracted from the image pair, then refines the RPC parameters with the DPBA algorithm, and finally implements the stereo reconstruction with the improved RPC files and the feature matching results to obtain the 3D point cloud and DSM. For multi-date satellite images, the image pair selection is implemented before feature matching, and the 3D point cloud is aligned and fused after stereo reconstruction to generate the final 3D point cloud and DSM.

3.1. Feature Extraction and Matching

The proposed method adopts the Scale Invariant Feature Transform (SIFT) [35] to extract key points from the satellite images. SIFT descriptors are invariant to both rotation and scale, and they are robust to noise and to changes in viewing angle and illumination. SIFT descriptors consist of locations, scales, orientations, and feature vectors, where the feature vector is a 128-dimensional vector that records the detailed gradient information of the local region at the key point. The number of SIFT descriptors therefore has a great impact on the accuracy and speed of the matching process. The more descriptors are extracted, the more features are matched, but the longer feature extraction and matching take, as shown in Table 1. In order to meet the accuracy requirements as much as possible, in the proposed method, the threshold value of the key point extraction is set to 30,000.

After extracting the SIFT descriptors, the Nearest Neighbor Search (NNS) method is used to find the points that best match the key points. SIFT descriptors are usually matched using the Euclidean distance. When using the NNS method to match SIFT descriptors, the nearest neighbor and the second nearest neighbor of each key point are found by the K-dimensional tree algorithm [36]. Then the Euclidean distances from the key point to the nearest neighbor and to the second nearest neighbor are calculated. If the ratio of the former distance to the latter is lower than a threshold (generally chosen in the range [0.5, 0.7]), the match is considered good.
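The ratio test described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: a brute-force distance matrix stands in for the K-dimensional tree (an exact search returns the same neighbors), the descriptors are random stand-ins rather than real SIFT output, and the function name and the ratio of 0.6 are hypothetical choices within the quoted range.

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.6):
    """Match descriptors of image A to image B with the nearest/second-nearest ratio test."""
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)
    matches = []
    for i, (first, second) in enumerate(order[:, :2]):
        # Accept only if the best match is clearly closer than the runner-up.
        if dists[i, first] < ratio * dists[i, second]:
            matches.append((i, int(first)))
    return matches

rng = np.random.default_rng(0)
desc_b = rng.random((50, 128))                                # stand-in descriptors
desc_a = desc_b[:5] + 0.01 * rng.standard_normal((5, 128))    # noisy copies of the first five
matches = ratio_test_matches(desc_a, desc_b)
```

Each noisy copy lies far closer to its original than to any other descriptor, so all five pass the test and are matched to their originals. For tens of thousands of key points, a tree-based search would replace the dense distance matrix.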

3.2. Stereo Pairs Selection

In the proposed method, the stereo pair selection is needed for multi-date satellite images, and this process is to ensure that the reconstruction can adopt the multi-view stereo method. The quality of the DSM generated from different image pairs in multi-date satellite images varies widely, as shown in Figure 2. The intersection angles of image pair A and image pair B are 1.3° and 5.9°, respectively. More acquisition information will be introduced in Section 4.

There are four main strategies for stereo pair selection: the ground-truth-based strategy, the learning-based strategy, the SIFT-based strategy, and the experience-based strategy. The ground-truth-based strategy calculates the DSMs of all possible pairs and ranks them according to the quality evaluation results. Although this strategy yields 3D points with high accuracy, it is impractical, especially when the ground truth DSM is missing (e.g., emergency cartography). The learning-based strategy trains a learning algorithm on the metadata of satellite images (such as intersection angle, sun azimuth angle, and sun elevation angle) to obtain an image pair selection model [21]. In order to obtain a DSM with high accuracy, this strategy needs a large amount of image parameter data for training. The SIFT-based strategy sorts image pairs based on the number of matching points. This strategy is simple, but the accuracy of the generated DSM is poor [37]. The experience-based strategy sorts pairs according to their intersection angle and acquisition date [20], which is very simple but still does not take all the factors into account.

3.3. RPC Refinement by DPBA

In this section, we use the DPBA method to reduce the RPC error caused by the satellite attitude. DPBA is an improved bundle adjustment algorithm, which is mainly used to improve the accuracy of the generated DSM and to speed up the slow convergence of the traditional BA method.

In the RPC refinement process, 3D points are generated by triangulation on the matching point pairs or matching point sets generated in Section 3.1. Then DPBA is performed based on the 2D-3D point correspondences. Due to the large number of RPC parameters (about 90 parameters per satellite image), directly optimizing the RPC parameters would cause a heavy computational burden, so the proposed method chooses some satellite camera attitude parameters and the 3D point coordinates as the optimization variables of DPBA. The attitude parameters of the satellite camera include the rotation transformation $R$ and the translation transformation $t$. In fact, the attitude error of the satellite is mainly related to the attitude angle that defines the orientation of the sensor [32]. This error can be compensated by a rotational transformation. Therefore, the objective function of DPBA can be expressed by Equation (4).

\begin{matrix} \min_{(X_{n}, R_{m})} \frac{1}{2} \sum_{m = 1}^{M} \left\{ \frac{1}{2} \sum_{n = 1}^{N} \left[ {‖x_{m n} - P_{m} (R_{m} X_{n} + t_{m})‖}_{2}^{2} + \mathrm{PE}_{1} \right] + \mathrm{PE}_{2} \right\}, \end{matrix}

(4)

where $x_{m n}$, $X_{n}$, $P_{m} (\cdot)$, $M$ and $N$ have the same meaning as in Equation (3). In addition, $R_{m}$ and $t_{m}$ indicate the rotation and translation of the $m$-th camera, respectively. $\mathrm{PE}_{1}$ and $\mathrm{PE}_{2}$ are the two penalty functions in the form shown in Equation (5):

\begin{cases} \mathrm{PE}_{1} = \frac{λ}{2} {‖X_{n}‖}_{2}^{2} \\ \mathrm{PE}_{2} = \frac{τ}{2} {‖R_{m}‖}_{2}^{2} \end{cases},

(5)

where $λ$ and $τ$ stand for the penalty parameters. In order to keep the values of the two penalty functions and the value of the original objective at approximately the same order of magnitude, we take $λ$ = 0.0000001 and $τ$ = 100,000. Because the optimization variables $X_{n}$ and $R_{m}$ in Equation (5) have different physical meanings and differ greatly in magnitude, we design two separate penalty functions. In Equation (5), the first penalty function acts on the 3D point $X_{n}$, an optimization variable related to the number of matching points. The second penalty function acts on the rotation matrix $R_{m}$, the optimization variable related to the camera attitude. Following the form of ridge estimation, both penalty functions are designed in the form of the $l_{2}$-norm.

To optimize efficiently, we use Euler angles to represent the rotation transformations in the camera’s attitude parameters. Hence, $R_{m}$ can be expressed as follows:

\begin{aligned} R_{m} & = R_{X} (α_{m}) R_{Y} (β_{m}) R_{Z} (γ_{m}) \\ & = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos α_{m} & - \sin α_{m} \\ 0 & \sin α_{m} & \cos α_{m} \end{pmatrix} \begin{pmatrix} \cos β_{m} & 0 & \sin β_{m} \\ 0 & 1 & 0 \\ - \sin β_{m} & 0 & \cos β_{m} \end{pmatrix} \begin{pmatrix} \cos γ_{m} & - \sin γ_{m} & 0 \\ \sin γ_{m} & \cos γ_{m} & 0 \\ 0 & 0 & 1 \end{pmatrix}, \end{aligned}

(6)

where $R_{X} (\cdot)$, $R_{Y} (\cdot)$, and $R_{Z} (\cdot)$ represent the rotations about the X-axis, Y-axis, and Z-axis, respectively, and $α_{m}$, $β_{m}$ and $γ_{m}$ denote the Euler angles of the $m$-th camera to be optimized. The subsequent calculation of the refined RPC parameters adopts the method in [32].
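The following Python sketch puts Equations (4) to (6) together: it builds the rotation from Euler angles and evaluates the double-penalty objective. It is an illustration rather than the authors' implementation. In particular, it assumes that the second penalty is applied to the Euler-angle vector of each camera (one possible reading of the norm of $R_{m}$), and the function names, array layout, and toy projection are hypothetical.

```python
import numpy as np

def euler_to_rotation(alpha, beta, gamma):
    """R = R_X(alpha) R_Y(beta) R_Z(gamma), as in Equation (6)."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    r_x = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    r_y = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    r_z = np.array([[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]])
    return r_x @ r_y @ r_z

def dpba_objective(observations, points, angles, translations, projections,
                   lam=1e-7, tau=1e5):
    """Equations (4)-(5), keeping the nested 1/2 factors as written.

    Assumption: PE_2 penalizes the Euler-angle vector (alpha_m, beta_m, gamma_m).
    """
    total = 0.0
    for m, project in enumerate(projections):
        rotation = euler_to_rotation(*angles[m])
        inner = 0.0
        for n, X in enumerate(points):
            residual = observations[m, n] - project(rotation @ X + translations[m])
            inner += residual @ residual + 0.5 * lam * (X @ X)      # error + PE_1
        total += 0.5 * inner + 0.5 * tau * (angles[m] @ angles[m])  # + PE_2
    return 0.5 * total

points = np.array([[1.0, 2.0, 3.0]])
projections = [lambda X: X[:2]]            # toy stand-in for an RPC projection
angles = np.zeros((1, 3))
translations = np.zeros((1, 3))
observations = np.array([[[1.0, 2.0]]])    # exact observation, zero reprojection error
value = dpba_objective(observations, points, angles, translations, projections,
                       lam=1.0, tau=1.0)
```

With zero reprojection error and zero angles, only the point penalty remains: the squared norm of the point is 14, and the three factors of 1/2 give 1.75 for `lam=1.0`. Minimizing over the angles and points would be handed to a least-squares solver.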

3.4. Stereo Reconstruction

Stereo reconstruction includes image rectification and stereo matching. The image rectification process transforms the satellite images in order to reduce the computational burden of stereo matching. Stereo matching generally uses dense matching methods combined with triangulation to generate 3D point clouds.

The current satellite images are all acquired by pushbroom cameras. In fact, the epipolar line that satisfies the epipolar constraint on the stereo pair of the pushbroom images is not a straight line but a curve [6]. However, for a small Area of Interest (AOI), the epipolar curve of the pushbroom images can be approximated as a group of parallel straight lines [8]. Therefore, the stereo rectification on the pushbroom images can be generated by the rigid 2D transformation of the images. If the processed satellite image tile is small (1000 × 1000 pixels), it can be rectified by affine transformation [7]. The proposed method designs two steps for stereo rectification of satellite images, as shown in Figure 3. Firstly, the affine transformation is calculated according to the point correspondences generated in Section 3.1, and then the affine transformation is performed on the image.

After rectification, the stereo pairs also need to be stereo matched. Semi-Global Matching (SGM) is commonly used for binocular stereovision systems. In the SGM algorithm, the parallax map of the stereo pair is obtained by minimizing the cost function of the dynamic programming. Then the stereo matching process combines the parallax map with the refined RPC model to triangulate. Finally, the 3D point cloud and the DSM can be generated. In order to improve the efficiency of the algorithm, the proposed method uses an improved SGM algorithm to perform dense matching [7,10]. We adopt Census cost [38] for SGM, and replace the one direction of the dynamic programming with the mean of the parallax information in the two vertical directions.
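For readers unfamiliar with the Census cost used in SGM, the sketch below computes a 3 × 3 census transform and the Hamming-distance cost between two census images. The window size, bit ordering, and function names are assumptions for illustration; the paper does not specify them, and the dynamic-programming aggregation of SGM is not shown.

```python
import numpy as np

def census_transform(image):
    """3x3 census transform: an 8-bit code for every interior pixel.

    A bit is set when the corresponding neighbor is darker than the center pixel.
    """
    h, w = image.shape
    center = image[1:h - 1, 1:w - 1]
    codes = np.zeros(center.shape, dtype=np.uint8)
    offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]
    for dy, dx in offsets:
        neighbor = image[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes = (codes << 1) | (neighbor < center).astype(np.uint8)
    return codes

def census_cost(codes_left, codes_right):
    """Matching cost: Hamming distance between census codes."""
    diff = np.bitwise_xor(codes_left, codes_right)
    return np.unpackbits(diff[..., None], axis=-1).sum(axis=-1)

left = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.float64)
right = 2.0 * left + 10.0     # same structure, different brightness and contrast
cost = census_cost(census_transform(left), census_transform(right))
```

Because the census code depends only on the ordering of intensities, the cost between `left` and `right` is zero despite the radiometric change, which is the property that makes this cost attractive for multi-date imagery.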

3.5. Alignment and Fusion

Since the proposed method adopts the stereo-pair-based multi-view stereo method for DSM generation, multiple 3D point clouds of AOI will be obtained when it performs DSM generation on multi-date satellite images. In order to improve the accuracy of the 3D models, multiple 3D point clouds need to be aligned and fused. In this process, we project the 3D point clouds on a geographic grid and align the 3D point clouds according to the geographic coordinate, then we average the pixels at the same location.
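The fusion step can be sketched as follows. This is a simplified illustration, assuming the point clouds are already georeferenced in a common coordinate system: every point is binned into a grid cell and the heights falling into the same cell are averaged. The function name, argument names, and toy coordinates are hypothetical.

```python
import numpy as np

def fuse_point_clouds(clouds, origin, cell_size, shape):
    """Average the heights of several georeferenced point clouds on one grid.

    clouds:    list of arrays of shape (K, 3) holding (easting, northing, height)
    origin:    (easting, northing) of the grid's lower-left corner
    cell_size: grid spacing in the same units as the coordinates
    shape:     (rows, cols) of the output DSM
    Cells that receive no point are set to NaN.
    """
    height_sum = np.zeros(shape)
    count = np.zeros(shape)
    for cloud in clouds:
        cols = np.floor((cloud[:, 0] - origin[0]) / cell_size).astype(int)
        rows = np.floor((cloud[:, 1] - origin[1]) / cell_size).astype(int)
        inside = (rows >= 0) & (rows < shape[0]) & (cols >= 0) & (cols < shape[1])
        # np.add.at accumulates correctly when several points share a cell.
        np.add.at(height_sum, (rows[inside], cols[inside]), cloud[inside, 2])
        np.add.at(count, (rows[inside], cols[inside]), 1.0)
    dsm = np.full(shape, np.nan)
    filled = count > 0
    dsm[filled] = height_sum[filled] / count[filled]
    return dsm

cloud_a = np.array([[0.2, 0.2, 20.0], [1.5, 0.5, 30.0]])
cloud_b = np.array([[0.7, 0.4, 22.0]])
dsm = fuse_point_clouds([cloud_a, cloud_b], origin=(0.0, 0.0), cell_size=1.0, shape=(2, 2))
```

In this toy case the two points that fall in cell (0, 0) average to 21 m, cell (0, 1) keeps its single height of 30 m, and the empty cells stay NaN. A more robust variant could use the per-cell median instead of the mean.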

4. Experimental Results and Discussion

In this section, we verify the effectiveness of the proposed method on a benchmark dataset. We also analyze the experimental results of satellite image pairs and multi-date satellite images, respectively. We implement our method in Python on Linux, and the main dependency of our code is the Geospatial Data Abstraction Library (GDAL).

4.1. Dataset and Metrics

The proposed method is tested on the public benchmark dataset IARPA Multi-View Stereo 3D Mapping (MVS3DM) Challenge [39]. The dataset contains 50 panchromatic images captured by the WorldView-3 satellite with 30 cm nadir resolution, covering an area of 100 km² near San Fernando, Argentina. These satellite images were acquired from November 2014 to January 2016. The dataset also provides the airborne lidar data with 20 cm nadir resolution that can be used as the ground truth DSM. We selected 4 sites for the experiments of the multi-date satellite images, including low-rise buildings, medium-tall buildings, and parks, which are rich in building types and can fully verify the effectiveness of the proposed method. Among these sites, the geographic range of Site 1 is 0.9 × 0.9 km², and each of Site 2, Site 3, and Site 4 is 0.6 × 0.6 km². In addition, we also selected 10 image pairs from the multi-date satellite images to test the DSM generation performance of the proposed method on satellite image pairs.

For evaluation, we follow the quality evaluation metrics used in previous work [25,26,27]: completeness (CP) and median error (ME). CP is defined as the percentage of points in a 3D point cloud whose altitude error is less than 1 m, and ME stands for the median elevation error, which is computed by aligning the generated DSM with the ground truth DSM and comparing them pixel-wise.
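The two metrics can be computed as in the sketch below. It rests on assumptions not spelled out in the text: both DSMs are already aligned on the same grid, holes are stored as NaN, CP is taken relative to the number of valid ground truth cells, and ME is the median of the absolute errors. The function name is hypothetical.

```python
import numpy as np

def dsm_metrics(generated, ground_truth, threshold=1.0):
    """Return (CP in percent, ME in meters) for two aligned DSM grids."""
    valid_truth = ~np.isnan(ground_truth)
    both = valid_truth & ~np.isnan(generated)
    abs_err = np.abs(generated[both] - ground_truth[both])
    # Holes in the generated DSM count against completeness.
    completeness = 100.0 * np.count_nonzero(abs_err < threshold) / np.count_nonzero(valid_truth)
    median_error = float(np.median(abs_err))
    return completeness, median_error

truth = np.array([[20.0, 21.0], [22.0, 23.0]])
generated = np.array([[20.2, 21.5], [25.0, np.nan]])   # one hole and one 3 m outlier
cp, me = dsm_metrics(generated, truth)
```

Here two of the four ground truth cells are reproduced to within 1 m, so CP is 50%, and the median of the three available absolute errors (0.2 m, 0.5 m, 3 m) is 0.5 m.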

4.2. Performance Analysis on Satellite Image Pair

In this section, we select 10 satellite image pairs from the IARPA MVS3DM dataset to test the proposed method and the effect of the acquisition parameters on DSM generation. The input satellite image and the ground truth DSM are shown in Figure 4. Due to limited space, we only show one of the input satellite images. In Figure 4b, we draw the ground truth DSM as an RGB image for visualization. The altitude range is 18–35 m; areas with lower altitudes are shown in blue and areas with higher altitudes in red. The geographic range of the input satellite images is slightly larger than that of the ground truth DSM, to ensure that the generated DSM fully covers the ground truth DSM.

The acquisition parameters of the satellite image pairs are listed in Table 2. We choose the acquisition interval, sun angle difference, and intersection angle as the main parameters affecting DSM generation.

The acquisition interval has a great influence on how buildings and vegetation appear in the satellite images [20]. In our experiment, the acquisition interval ranges from 1 day to 398 days, which covers the minimum and maximum intervals of the IARPA MVS3DM dataset. Here the sun angle difference is defined as the root mean square of the difference in sun azimuth angle and the difference in sun elevation angle, as shown in Equation (7).

\begin{matrix} θ_{s d} = \sqrt{\frac{θ_{s a d}^{2} + θ_{s e d}^{2}}{2}}, \end{matrix}

(7)

where $θ_{s d}$ denotes the sun angle difference, $θ_{s a d}$ represents the difference of the sun azimuth angle, and $θ_{s e d}$ stands for the difference of the sun elevation angle. The sun angle difference affects the imaging of shadow areas and specular reflection areas on the ground [21]. In this paper, we test sun angle differences from 0.07° to 26.83°. An excessive sun angle difference reduces the number of matched pixels in stereo matching, resulting in a poor-quality DSM, as shown in Figure 5a. The intersection angle of the image pair is a traditional factor that affects stereo matching. We test intersection angles from 0.4° to 16.8°. Experimental results with an intersection angle larger than 20° are not good. The DSM generated by an image pair with an intersection angle of more than 20° is shown in Figure 5b.
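Equation (7) is simple enough to check by hand. The short Python sketch below evaluates it for hypothetical angle differences that are not taken from Table 2.

```python
import math

def sun_angle_difference(azimuth_diff_deg, elevation_diff_deg):
    """Equation (7): root mean square of the azimuth and elevation differences."""
    return math.sqrt((azimuth_diff_deg**2 + elevation_diff_deg**2) / 2.0)

theta = sun_angle_difference(3.0, 4.0)   # hypothetical differences in degrees
```

For differences of 3° and 4° the result is the square root of 12.5, roughly 3.54°.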

The generated DSMs of image pair 1 to image pair 10 are shown in Figure 6. In the experiment, we chose the S2P method [7] to compare with the proposed method. The S2P method is a DSM generation method of satellite images based on geometric stereo rectification, which is one of the methods with the highest accuracy in generating point clouds in recent years, and the s2p 1.0b25 software can be found in [40]. Here we also use the traditional BA method [32] as a comparison method.

In order to be consistent with the ground truth in Figure 4, we again represent the generated DSMs as RGB images in Figure 6, with the same altitude range of 18–35 m. It can be seen from the first column of Figure 6 that there are many holes in most of the DSMs generated by the S2P method, and their visual quality is poor. As for image pairs 4, 8, and 10, although the S2P method generates fewer holes in the DSM, the image color shows that the altitude error of the DSM is large. The color of the DSM generated from image pair 10 is closer to the ground truth than that of image pair 4 and image pair 8. The altitude errors of the DSMs corresponding to image pairs 4, 8, and 10 are about 3 m, 2 m, and 1 m, respectively, which supports this observation. However, the DSM generated from image pair 5 is better because the S2P method relies heavily on the results of geometric stereo rectification. When the stereo rectification result is poor, it leads to more mismatches in the subsequent stereo matching process, resulting in poor accuracy of the generated DSMs.

As shown in the second and third columns of Figure 6, the altitude information of the DSMs generated by the traditional BA method is similar to that of the proposed method, and the DSMs generated by both methods are closer to the ground truth DSM than those of the S2P method. Among the ten sets of DSMs, the DSM generated by image pair 4 appears green along some roads, which differs from the blue color of the ground truth DSM, so the altitude error there is large. This may be due to the large intersection angle of image pair 4. In addition, the DSMs generated by image pairs 6 and 7 have more holes because there are fewer matching points; the underlying reason is that the sun angle differences of these two image pairs are too large (more than 20°). The visual quality of the DSMs generated by image pairs 8, 9, and 10 is better than that of image pairs 6 and 7, which indicates that even if the acquisition interval is large, DSMs with better visual quality can be generated as long as the sun angle difference and the intersection angle of the image pair are appropriate.

In general, the DSMs generated by the proposed method appear very similar to those of the traditional BA method in Figure 6. They are, however, slightly different. The difference is most obvious at the intersection in the upper right corner of image pair 9, where the 3D points generated by the proposed method are denser. The details of image pair 9 are shown in Figure 7. To compare the proposed method with the traditional BA method more precisely, a quantitative quality evaluation of the generated DSMs is needed.

The results of the quality evaluation experiments are listed in Table 3. The metrics in bold are the best of the three methods. According to the metrics in Section 4.1, the CP of the generation result should be as high as possible: a higher CP means more good points (points with an altitude accuracy better than 1 m) in the generated DSM. At the same time, the ME should be as low as possible: a lower ME means higher accuracy of the generated DSM.
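The two metrics can be illustrated with a minimal sketch. The exact definitions are given in Section 4.1; the code below only assumes what is stated here (a good point has an altitude error below 1 m, and ME is a median error). The function name, the use of NaN to mark holes, and the normalization of CP by the number of valid ground truth cells are assumptions made for illustration.

```python
import numpy as np

def completeness_and_median_error(dsm, gt, threshold_m=1.0):
    """Return (CP in percent, ME in metres) of a generated DSM against ground truth.

    Assumptions for illustration: NaN marks holes, CP is normalised by the
    number of valid ground truth cells, and ME is the median absolute altitude
    error over cells that are valid in both rasters.
    """
    dsm = np.asarray(dsm, dtype=float)
    gt = np.asarray(gt, dtype=float)
    gt_valid = ~np.isnan(gt)
    both_valid = gt_valid & ~np.isnan(dsm)
    abs_err = np.abs(dsm[both_valid] - gt[both_valid])
    n_gt = np.count_nonzero(gt_valid)
    cp = 100.0 * np.count_nonzero(abs_err < threshold_m) / n_gt if n_gt else float("nan")
    me = float(np.median(abs_err)) if abs_err.size else float("nan")
    return cp, me

# Toy 2 x 3 rasters (altitudes in metres); NaN cells are holes.
gt = np.array([[20.0, 21.0, 22.0], [23.0, np.nan, 25.0]])
dsm = np.array([[20.2, 21.5, 24.0], [np.nan, 30.0, 25.9]])
cp, me = completeness_and_median_error(dsm, gt)
print(f"CP = {cp:.1f} %, ME = {me:.2f} m")
```

In this toy case three of the five valid ground truth cells are reconstructed to better than 1 m, and a hole in the generated DSM lowers CP without affecting ME.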

In Table 3, the proposed method has a high CP of more than 50% for most of the image pairs. Among the first five DSMs, the DSM generated by image pair 4 has a slightly lower CP due to the larger intersection angle, which is consistent with the visual quality results in Figure 6. The DSMs generated by image pairs 6 and 7 have lower CP because of the large sun angle difference. Moreover, the DSMs generated by image pairs 8–10 also have a slightly lower CP on account of the large acquisition interval of the image pair. However, the CP of the DSMs generated by image pairs 9 and 10 is better than that of image pair 8. The acquisition interval of image pair 8 is 152 days, and its two satellite images were acquired in autumn and spring, respectively, which causes large differences in surface vegetation and thus affects the stereo matching step. Although the acquisition interval of image pairs 9 and 10 is more than 300 days, the satellite images were all collected in winter, so the land surface does not change much. Therefore, even if the acquisition interval reaches 310 days, as long as the surface changes little and the sun angle difference and the intersection angle of the image pair are not large, the proposed method can still achieve good generation results. For the DSMs generated by the S2P method, most of the CP values are poor, while the values generated by the traditional BA method are better. In general, the CP value of the proposed method is 0.6–3.0 percentage points higher than that of the traditional BA method in most cases.

We also compare the ME of the DSM altitude, which reflects the accuracy attained by 50% of the generated 3D points in the DSM. In Table 3, the ME value of the DSM generated by the proposed method ranges from 0.70 m to 1.93 m, which is 0.4–6.44 m lower than that of the S2P method and 0.01–0.06 m lower than that of the traditional BA method in most cases. Therefore, for most image pairs, the proposed method is better and more robust than the S2P method and the traditional BA method.
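As an arithmetic check, the ME comparison can be recomputed directly from the values in Table 3. The sketch below only copies the table entries; the variable names are illustrative.

```python
# ME values in metres copied from Table 3: (S2P, traditional BA, proposed).
ME_TABLE_3 = {
    1: (5.40, 0.78, 0.75),
    2: (4.10, 0.74, 0.70),
    3: (2.18, 0.74, 0.71),
    4: (2.89, 0.85, 0.83),
    5: (0.47, 0.88, 0.82),
    6: (5.87, 1.40, 1.40),
    7: (7.54, 1.94, 1.93),
    8: (2.09, 1.07, 1.04),
    9: (7.39, 0.97, 0.95),
    10: (1.35, 1.00, 0.95),
}

# Positive values mean the proposed method has the lower (better) ME.
gain_vs_s2p = {p: round(s2p - prop, 2) for p, (s2p, _, prop) in ME_TABLE_3.items()}
gain_vs_ba = {p: round(ba - prop, 2) for p, (_, ba, prop) in ME_TABLE_3.items()}
proposed_me = [prop for _, _, prop in ME_TABLE_3.values()]

print("proposed ME range:", min(proposed_me), "to", max(proposed_me))
print("gain over S2P per pair:", gain_vs_s2p)
print("gain over traditional BA per pair:", gain_vs_ba)
```

The per-pair gains make the exceptions visible: for image pair 5 the S2P method has the lower ME, and for image pair 6 the proposed method and the traditional BA method are tied.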

From Table 2 and Table 3, we can see that a large acquisition interval brings more uncertainty about surface buildings and vegetation, and that it is difficult to generate a good DSM when the sun angle difference or the intersection angle is large. According to Table 3, the CP of the DSMs generated by the proposed method for the first three image pairs is more than 60%, and the ME is less than 1.27 m. Therefore, the proposed method obtains better results for image pairs with a small acquisition interval, sun angle difference, and intersection angle. In general, when the acquisition interval is less than 15 days, the sun angle difference is less than 20°, and the intersection angle is 3°–15°, the proposed method can generate a good DSM.
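The selection rule stated above can be written as a simple filter over the acquisition parameters of Table 2. This is a minimal sketch: the class and function names are hypothetical, and treating the 3° and 15° bounds as inclusive is an assumption, since the text does not say how boundary values are handled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PairInfo:
    pair_id: int
    interval_days: int
    sun_angle_diff_deg: float
    intersection_angle_deg: float

# Acquisition parameters copied from Table 2.
TABLE_2 = [
    PairInfo(1, 1, 0.07, 5.9),
    PairInfo(2, 1, 0.16, 9.2),
    PairInfo(3, 1, 0.22, 3.1),
    PairInfo(4, 7, 1.41, 16.8),
    PairInfo(5, 26, 2.34, 8.0),
    PairInfo(6, 76, 22.29, 1.3),
    PairInfo(7, 95, 26.83, 0.4),
    PairInfo(8, 152, 4.10, 3.6),
    PairInfo(9, 310, 9.31, 7.6),
    PairInfo(10, 398, 6.13, 12.2),
]

def is_suitable(pair, max_interval_days=15, max_sun_diff_deg=20.0,
                angle_range_deg=(3.0, 15.0)):
    """Apply the thresholds from the text; inclusive angle bounds are an assumption."""
    lo, hi = angle_range_deg
    return (pair.interval_days < max_interval_days
            and pair.sun_angle_diff_deg < max_sun_diff_deg
            and lo <= pair.intersection_angle_deg <= hi)

selected = [p.pair_id for p in TABLE_2 if is_suitable(p)]
print(selected)  # [1, 2, 3]
```

Only image pairs 1 to 3 pass all three thresholds, which matches the pairs singled out in the text as having CP above 60%.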

4.3. Performance Analysis on Multi-Date Satellite Images

We selected four sites to test the performance of the proposed method on multi-date satellite images, where Site 1 is the same as the site in Section 4.2. The input images of Site 2, Site 3, and Site 4 are shown in Figure 8, and the corresponding ground truth DSMs are shown in the first column of Figure 9. Site 1 is a mixed area of low-to-medium buildings and roads, with an altitude range of 18–35 m. Site 2 is a medium-high building area with an altitude range of 18–42 m, and we still use red to indicate high altitude. Site 3 is a low-rise building area with an altitude range of 18–35 m, and most of the buildings in this area are blue (blue indicates low altitude). Site 4 is a park area that contains some water surface, so the ground truth DSM has significant holes in the water surface area.

According to the analysis in Section 4.2, we selected 10 image pairs with an acquisition interval of less than 15 days, a sun angle difference of less than 20°, and an intersection angle of 3°–15° to generate DSMs, and the 10 DSMs are then aligned and fused. We chose the multi-date S2P method [20] and the traditional BA method [32] to compare with the proposed method. The multi-date S2P method is one of the best methods for automatic DSM generation from multi-date satellite images [21,25]. Unlike the S2P method, which only deals with a single satellite image pair, the multi-date S2P method uses an experience-based image pair selection strategy to select the multi-date satellite images, generates multiple independent 3D point clouds, and then registers and fuses them.
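To make the fusion step concrete, the following sketch fuses several DSMs by a pixel-wise median. It is only an illustration under stated assumptions, not the alignment and fusion procedure of Section 3.5: the DSMs are assumed to be already aligned on a common grid, NaN is assumed to mark holes, and the function name and the `min_observations` parameter are hypothetical.

```python
import warnings
import numpy as np

def fuse_dsms(dsms, min_observations=1):
    """Pixel-wise median of pre-aligned DSMs, ignoring NaN holes.

    Cells observed in fewer than `min_observations` DSMs stay NaN.
    Generic median fusion for illustration only.
    """
    stack = np.stack([np.asarray(d, dtype=float) for d in dsms], axis=0)
    counts = np.count_nonzero(~np.isnan(stack), axis=0)
    with warnings.catch_warnings():
        # np.nanmedian warns on all-NaN cells; those cells are handled below.
        warnings.simplefilter("ignore", category=RuntimeWarning)
        fused = np.nanmedian(stack, axis=0)
    fused[counts < min_observations] = np.nan
    return fused

# Three toy 1 x 3 DSMs (altitudes in metres).
a = np.array([[20.0, np.nan, 30.0]])
b = np.array([[21.0, np.nan, np.nan]])
c = np.array([[25.0, np.nan, 32.0]])
fused = fuse_dsms([a, b, c])
print(fused)
```

A median is used here because it is less sensitive than a mean to a single DSM with a large altitude error; a cell that is a hole in every input stays a hole in the fused result.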

The final generated DSMs are shown in Figure 9, and their details are shown in Figure 10. Figure 9 and Figure 10 show that the details of the DSM generated by the multi-date S2P method in Site 1 are poor, while the DSMs generated by the proposed method and the traditional BA method have better visual quality. For Site 2, the DSM generated by the multi-date S2P method has a great number of 3D points with large errors (about 8 m) at the edges of the high buildings. The altitude estimated by the traditional BA method in the high building area is obviously too low (this area is orange instead of red), and the error in this area is about 4 m. The visual quality of the DSM generated by the proposed method is significantly better than that of the other two methods, but the edges of the buildings are still not clear. This is due to differences in the shadows and vegetation around the buildings between satellite images acquired on different dates. This issue will be considered in future work.

As shown in the third row of Figure 9, the visual quality of the DSMs generated by the multi-date S2P method and the proposed method in the Site 3 area is good, but the overall altitude of the DSM generated by the traditional BA method is 3.2 m higher than the ground truth DSM (the DSM in Figure 9 is mint green instead of blue). This may be because the objective function of the traditional BA method falls into a local optimum when solving the optimization problem, and the value at this local optimum is still large. For the Site 4 area, the altitude error of the DSM generated by the multi-date S2P method in the high building area is large (the overall altitude value here is about 1.5 m higher than the ground truth DSM), and the building edges of the DSM generated by the traditional BA method are not as clear as those of the DSM generated by the proposed method. In summary, the proposed method achieves good visual quality in the four sites.

The quality evaluation results of DSMs generated from multi-date satellite images are shown in Table 4. The proposed method has a high CP (over 60%) for the DSMs of all four sites, which is consistent with the visual quality results presented in Figure 9. For Site 4, the CP is lower for all three methods, due to the presence of many vegetation areas in Site 4 and the large variation of these areas between satellite images from different acquisition times. On the whole, the proposed method achieves more than 60% completeness for the DSMs generated from multi-date satellite images and performs well. By contrast, the CP value of the DSM generated by the multi-date S2P method can be as low as 50.49%, and that of the traditional BA method as low as 52.71%. In terms of accuracy, the ME values of the proposed method are all below 0.7 m, whereas the ME values of the multi-date S2P method can be as high as 0.99 m and those of the traditional BA method as high as 0.94 m. Based on Figure 9, Figure 10, and Table 4, the proposed method is more accurate in estimating the altitude of the building area and is significantly more robust than the multi-date S2P method and the traditional BA method.

In addition, we compare the optimization performance of the proposed method with that of the traditional BA method. The number of iterations and the optimization time of the two methods are listed in Table 5. Regardless of the number of iterations, the optimization time of the proposed method is less than that of the traditional BA method, which indicates that the proposed method finds the optimal solution faster and has better optimization performance. Therefore, the proposed method offers a possible way to address the slow convergence of the traditional BA method.
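The comparison in Table 5 can be summarized with a short calculation of the relative time saving and the time per iteration. The sketch below uses only the values listed in Table 5; the dictionary layout and function name are illustrative.

```python
# (number of iterations, optimization time in seconds), copied from Table 5.
TABLE_5 = {
    "Site 1": {"traditional_ba": (59, 19.73), "proposed": (56, 18.21)},
    "Site 2": {"traditional_ba": (61, 16.21), "proposed": (63, 15.91)},
    "Site 3": {"traditional_ba": (58, 19.94), "proposed": (65, 19.72)},
    "Site 4": {"traditional_ba": (61, 15.22), "proposed": (62, 14.68)},
}

def summarize(row):
    """Relative time saving and seconds per iteration for one site."""
    ba_iters, ba_time = row["traditional_ba"]
    dp_iters, dp_time = row["proposed"]
    return {
        "time_saving_pct": 100.0 * (ba_time - dp_time) / ba_time,
        "ba_s_per_iter": ba_time / ba_iters,
        "proposed_s_per_iter": dp_time / dp_iters,
    }

summary = {site: summarize(row) for site, row in TABLE_5.items()}
for site, s in summary.items():
    print(f"{site}: saving {s['time_saving_pct']:.2f} %, "
          f"{s['ba_s_per_iter']:.3f} s/iter (BA) vs "
          f"{s['proposed_s_per_iter']:.3f} s/iter (proposed)")
```

On these four sites the total time saving is modest (roughly 1% to 8%), while the time per iteration is lower for the proposed method at every site, including the sites where it needs more iterations.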

5. Conclusions

In this paper, we propose a DSM generation method for satellite images based on a double-penalty bundle adjustment optimization algorithm. The proposed method can automatically generate 3D point clouds and DSMs without ground control points. The DSM generated by the proposed method has good visual quality in building areas when the acquisition interval is less than 15 days, the sun angle difference is less than 20°, and the intersection angle is from 3° to 15°. The experimental results show that the completeness and the accuracy of the DSMs generated by the proposed method are better than those of the DSMs generated by the traditional BA method and the S2P method. The visual quality experiment and the quality evaluation experiment demonstrate the effectiveness and the robustness of the proposed method. However, for satellite image pairs with a sun angle difference larger than 20° or an intersection angle larger than 15°, or for areas with more vegetation, the DSMs generated by the proposed method need to be improved. This will be the subject of our future work.

Author Contributions

Conceptualization, H.L. and J.Y.; methodology, H.L. and L.J.; software, H.L.; validation, H.L. and J.Y.; formal analysis, H.L., L.J. and J.Y.; investigation, H.L. and L.J.; resources, H.L.; data curation, H.L.; writing—original draft preparation, H.L.; writing—review and editing, H.L. and L.J.; visualization, H.L. and L.J.; supervision, J.Y.; project administration, J.Y.; funding acquisition, H.L. and J.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Major Program of the National Natural Science Foundation of China (NSFC) (nos. 12292980, 12292984); by the National Key R&D Program of China (nos. 2023YFA1009000, 2023YFA1009004, 2020YFA0712203, 2020YFA0712201); by the Key Projects of the National Natural Science Foundation of China (NSFC) (no. 12031016); by the Beijing Natural Science Foundation (no. BNSF-Z210003); by the Department of Science, Technology, and Information of the Ministry of Education (no. 8091B042240); and by the Fundamental Research Funds for the Central Universities (no. 2412022QD024).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Zubkov, P.; Solberg, S.; McInnes, H. Windthrow damage detection in Nordic forests by 3D reconstruction of very high-resolution stereo optical satellite imagery. Int. J. Remote Sens. 2023, 44, 4963–4988.
2. Zhao, L.; Wang, H.; Zhu, Y.; Song, M. A review of 3D reconstruction from high-resolution urban satellite images. Int. J. Remote Sens. 2023, 44, 713–748.
3. Xie, S.; Zhang, L.; Jeon, G.; Yang, X. Remote sensing neural radiance fields for multi-view satellite photogrammetry. Remote Sens. 2023, 15, 3808.
4. Liao, P.; Chen, G.; Zhang, X.; Zhu, K.; Gong, Y.; Wang, T.; Li, X.; Yang, H. A linear pushbroom satellite image epipolar resampling method for digital surface model generation. ISPRS J. Photogramm. 2022, 190, 56–68.
5. Wei, K.; Huang, X.; Li, H. Stereo matching method for remote sensing images based on attention and scale fusion. Remote Sens. 2024, 16, 387.
6. Shi, J.; Yuan, X.; Cai, Y.; Wang, G. GPS real-time precise point positioning for aerial triangulation. GPS Solut. 2017, 21, 405–414.
7. De Franchis, C.; Meinhardt-Llopis, E.; Michel, J.; Morel, J.-M.; Facciolo, G. Automatic sensor orientation refinement of Pléiades stereo images. In Proceedings of the IEEE Geoscience and Remote Sensing Symposium, Quebec City, QC, Canada, 6 November 2014.
8. De Franchis, C.; Meinhardt-Llopis, E.; Michel, J.; Morel, J.M.; Facciolo, G. An automatic and modular stereo pipeline for pushbroom images. In Proceedings of the ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Zürich, Switzerland, 5–7 September 2014.
9. De Franchis, C.; Meinhardt-Llopis, E.; Michel, J.; Morel, J.M.; Facciolo, G. On stereo-rectification of pushbroom images. In Proceedings of the 2014 IEEE International Conference on Image Processing, Paris, France, 27–30 October 2014.
10. Qin, R. RPC Stereo Processor (RSP)—A software package for digital surface model and orthophoto generation from satellite stereo imagery. ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci. 2016, 3, 77–82.
11. Qin, R.; Song, S.; Ling, X.; Elhashash, M. 3D reconstruction through fusion of cross-view images. In Recent Advances in Image Restoration with Applications to Real World Problems; Intechopen: London, UK, 2020.
12. Gómez, A.; Randall, G.; Facciolo, G.; von Gioi, R.G. An experimental comparison of multi-view stereo approaches on satellite images. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2022.
13. Marí, R.; Facciolo, G.; Ehret, T. Sat-NeRF: Learning multi-view satellite photogrammetry with transient objects and shadow modeling using RPC cameras. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, New Orleans, LA, USA, 19–20 June 2022.
14. Qu, Y.; Deng, F. Sat-mesh: Learning neural implicit surfaces for multi-view satellite reconstruction. Remote Sens. 2023, 15, 4297.
15. Song, S.; Morelli, L.; Wu, X.; Qin, R.; Albanwan, H.; Remondino, F. Evaluating Learning-based Tie Point Matching for Geometric Processing of Off-Track Satellite Stereo. Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. 2024, 48, 393–400.
16. Michel, J.; Sarrazin, E.; Youssefi, D.; Cournet, M.; Buffe, F.; Delvit, J.M.; Emilien, A.; Bosman, J.; Melet, O.; L’Helguen, C. A new satellite imagery stereo pipeline designed for scalability, robustness and performance. ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci. 2020, 2, 171–178.
17. Patil, S.; Guo, Q. Stellar: A Large Satellite Stereo Dataset for Digital Surface Model Generation. Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. 2023, 48, 433–440.
18. Noh, M.J.; Howat, I.M. Analysis of PlanetScope Dove Digital Surface Model Accuracy Using Geometrically Simulated Images. Remote Sens. 2023, 15, 3496.
19. d’Angelo, P.; Reinartz, P. Digital elevation models from stereo, video and multi-view imagery captured by small satellites. Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. 2021, 43, 77–82.
20. Facciolo, G.; De Franchis, C.; Meinhardt-Llopis, E. Automatic 3D reconstruction from multi-date satellite images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017.
21. Qin, R. A critical analysis of satellite stereo pairs for digital surface model generation and a matching quality prediction model. ISPRS J. Photogramm. 2019, 154, 139–150.
22. Qin, R.; Ling, X.; Farella, E.M.; Remondino, F. Uncertainty-guided depth fusion from multi-view satellite images to improve the accuracy in large-scale DSM generation. Remote Sens. 2022, 14, 1309.
23. Pollard, T.B.; Eden, I.; Mundy, J.L.; Cooper, D.B. A volumetric approach to change detection in satellite images. Photogramm. Eng. Remote Sens. 2010, 76, 817–831.
24. Ozcanli, O.C.; Dong, Y.; Mundy, J.L.; Webb, H.; Hammoud, R.; Victor, T. Automatic geo-location correction of satellite imagery. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Columbus, OH, USA, 23–28 June 2014.
25. Zhang, K.; Snavely, N.; Sun, J. Leveraging vision reconstruction pipelines for satellite imagery. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, Seoul, Republic of Korea, 27–28 October 2019.
26. Bullinger, S.; Bodensteiner, C.; Arens, M. 3D surface reconstruction from multi-date satellite images. arXiv 2021, arXiv:2102.02502.
27. Wang, P.; Shi, L.; Chen, B.; Hu, Z.; Qiao, J.; Dong, Q. Pursuing 3-D scene structures with optical satellite images from affine reconstruction to Euclidean reconstruction. IEEE T. Geosci. Remote 2022, 60, 1–14.
28. Surayuda, R.H.; Sirin, D.N.S.; Maryanto, A.; Setiyoko, A.; Bayanuddin, A.A. Systematic Geometric Correction Method of Stereo Image Data Proposes for the Next Indonesian Satellite. In Proceedings of the 2023 IEEE Asia-Pacific Conference on Geoscience, Electronics and Remote Sensing Technology, Surabaya, Indonesia, 11 April 2024.
29. Tong, X.; Liu, S.; Weng, Q. Bias-corrected rational polynomial coefficients for high accuracy geo-positioning of QuickBird stereo imagery. ISPRS J. Photogramm. 2010, 65, 218–226.
30. Beyer, R.A.; Alexandrov, O.; McMichael, S. The Ames Stereo Pipeline: NASA’s open source software for deriving and processing terrain data. Earth Space Sci. 2018, 5, 537–548.
31. Leotta, M.J.; Long, C.; Jacquet, B.; Zins, M.; Lipsa, D.; Shan, J.; Xu, B.; Li, Z.; Zhang, X.; Chang, S.F.; et al. Urban semantic 3D reconstruction from multiview satellite imagery. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019.
32. Marí, R.; De Franchis, C.; Meinhardt-Llopis, E.; Anger, J.; Facciolo, G. A generic bundle adjustment methodology for indirect RPC model refinement of satellite imagery. Image Process. Lin. 2021, 11, 344–373.
33. Li, H.; Yin, J.; Jiao, L. An Improved 3D Reconstruction Method for Satellite Images Based on Generative Adversarial Network Image Enhancement. Appl. Sci. 2024, 14, 7177.
34. Triggs, B.; McLauchlan, P.F.; Hartley, R.I.; Fitzgibbon, A.W. Bundle Adjustment—A Modern Synthesis. In Vision Algorithms: Theory and Practice. IWVA 1999; Lecture Notes in Computer Science; Triggs, B., Zisserman, A., Szeliski, R., Eds.; Springer: Berlin/Heidelberg, Germany, 2000; Volume 1883, pp. 298–372.
35. Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 2004, 60, 91–110.
36. Bentley, J.L. Multidimensional binary search trees used for associative searching. Commun. ACM 1975, 18, 509–517.
37. Marí, R.; De Franchis, C.; Meinhardt-Llopis, E.; Facciolo, G. To bundle adjust or not: A comparison of relative geolocation correction strategies for satellite multi-view stereo. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, Seoul, Republic of Korea, 27–28 October 2019.
38. Fife, W.S.; Archibald, J.K. Improved census transforms for resource-optimized stereo vision. IEEE T. Circ. Syst. Vid. 2012, 23, 60–73.
39. Bosch, M.; Leichtman, A.; Chilcott, D.; Goldberg, H.; Brown, M. Metric evaluation pipeline for 3d modeling of urban scenes. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2017, 42, 239–246.
40. GitHub—MISS3D/s2p. Available online: https://github.com/MISS3D/s2p (accessed on 27 June 2024).

Figure 1. Block diagram of the proposed DSM generation method.

Figure 2. The reconstruction results from different image pairs: (a) image pair A; (b) image pair B.

Figure 3. The stereo rectification of the proposed method.

Figure 4. The test data in the IARPA MVS3DM dataset: (a) the input satellite image; (b) the ground truth DSM.

Figure 5. The DSM generated from: (a) the satellite image pair with more than 30° sun angle difference; (b) the satellite image pair with more than 20° intersection angle.

Figure 6. The generated DSMs of the S2P method [7], the traditional BA method [32], and the proposed method.

Figure 7. The details of the generated DSMs from image pair 9.

Figure 8. The input satellite images of Site 2, Site 3, and Site 4.

Figure 9. The ground truth DSM and the generated DSMs of the multi-date S2P method [20], the traditional BA method [32], and the proposed method.

Figure 10. The ground truth DSM and the generated DSMs of the multi-date S2P method [20], the traditional BA method [32], and the proposed method.

Table 1. The matching points and matching time of SIFT descriptors.

Number of SIFT Descriptors	Number of Matched Points	Matching Time (s)
10,000	1256	2.39
20,000	2580	3.01
30,000	3214	3.49

Table 2. The acquisition parameters of the satellite image pairs.

Image Pair	Acquisition Interval	Sun Angle Difference	Intersection Angle
1	1 day	0.07°	5.9°
2	1 day	0.16°	9.2°
3	1 day	0.22°	3.1°
4	7 days	1.41°	16.8°
5	26 days	2.34°	8.0°
6	76 days	22.29°	1.3°
7	95 days	26.83°	0.4°
8	152 days	4.10°	3.6°
9	310 days	9.31°	7.6°
10	398 days	6.13°	12.2°

Table 3. The quality evaluation results of DSMs generated from satellite image pairs.

Image Pair	Method	CP (%)	ME (m)
1	S2P [7]	1.37	5.40
	Traditional BA [32]	61.66	0.78
	Proposed method	63.02	0.75
2	S2P [7]	1.24	4.10
	Traditional BA [32]	70.31	0.74
	Proposed method	71.18	0.70
3	S2P [7]	2.78	2.18
	Traditional BA [32]	71.36	0.74
	Proposed method	71.90	0.71
4	S2P [7]	5.25	2.89
	Traditional BA [32]	57.27	0.85
	Proposed method	58.09	0.83
5	S2P [7]	72.75	0.47
	Traditional BA [32]	58.65	0.88
	Proposed method	61.66	0.82
6	S2P [7]	3.12	5.87
	Traditional BA [32]	38.55	1.40
	Proposed method	38.50	1.40
7	S2P [7]	2.60	7.54
	Traditional BA [32]	28.72	1.94
	Proposed method	28.79	1.93
8	S2P [7]	13.80	2.09
	Traditional BA [32]	47.18	1.07
	Proposed method	48.59	1.04
9	S2P [7]	2.93	7.39
	Traditional BA [32]	52.32	0.97
	Proposed method	53.27	0.95
10	S2P [7]	34.01	1.35
	Traditional BA [32]	49.83	1.00
	Proposed method	52.36	0.95

Table 4. The quality evaluation results of DSMs generated from multi-date satellite images.

AOI	Method	CP (%)	ME (m)
Site 1	Multi-date S2P [20]	71.87	0.60
	Traditional BA [32]	71.99	0.68
	Proposed method	72.60	0.65
Site 2	Multi-date S2P [20]	65.67	0.42
	Traditional BA [32]	66.98	0.55
	Proposed method	66.91	0.47
Site 3	Multi-date S2P [20]	71.47	0.52
	Traditional BA [32]	61.58	0.78
	Proposed method	68.61	0.46
Site 4	Multi-date S2P [20]	50.49	0.99
	Traditional BA [32]	52.71	0.94
	Proposed method	62.66	0.66

Table 5. The optimization performance of the traditional BA method and the proposed method.

AOI	Method	Number of Iterations	Optimization Time (s)
Site 1	Traditional BA [32]	59	19.73
Site 1	Proposed method	56	18.21
Site 2	Traditional BA [32]	61	16.21
Site 2	Proposed method	63	15.91
Site 3	Traditional BA [32]	58	19.94
Site 3	Proposed method	65	19.72
Site 4	Traditional BA [32]	61	15.22
Site 4	Proposed method	62	14.68

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
