Article

Transformation Model with Constraints for High-Accuracy of 2D-3D Building Registration in Aerial Imagery

1 Guangxi Key Laboratory for Geospatial Informatics, Guilin University of Technology, Guilin 541004, China
2 The Center for Remote Sensing, Tianjin University, No. 92, Weijin Road, Nankai District, Tianjin 300072, China
3 Chinese Academy of Surveying and Mapping, 28 Lianhuachi West Road, Beijing 100830, China
4 School of Precision Instrument and Opto-Electronics Engineering, Tianjin University, No. 92, Weijin Road, Nankai District, Tianjin 300072, China
5 Department of Modeling, Simulation, and Visualization Engineering, Old Dominion University, Norfolk, VA 23529, USA
* Author to whom correspondence should be addressed.
Remote Sens. 2016, 8(6), 507; https://doi.org/10.3390/rs8060507
Submission received: 27 January 2016 / Revised: 28 April 2016 / Accepted: 26 May 2016 / Published: 16 June 2016

Abstract: This paper proposes a novel rigorous transformation model for 2D-3D registration to address the difficult problem of obtaining a sufficient number of well-distributed ground control points (GCPs) in urban areas with tall buildings. The proposed model applies two types of geometric constraints, co-planarity and perpendicularity, to the conventional photogrammetric collinearity model. Both types of geometric information are obtained directly from geometric building structures, from which the geometric constraints are automatically created and combined into the conventional transformation model. A test field located in downtown Denver, Colorado, is used to evaluate the accuracy and reliability of the proposed method, and a comparative analysis of the accuracy achieved by the proposed and conventional methods is conducted. Experimental results demonstrate that: (1) the theoretical accuracy of the solved registration parameters can reach 0.47 pixels, whereas the other methods reach only 1.23 and 1.09 pixels; (2) the RMS values of the 2D-3D registration achieved by the proposed model are only about two pixels along the x and y directions, much smaller than those of the conventional model, which are approximately 10 pixels. These results demonstrate that the proposed method can significantly improve the accuracy of 2D-3D registration with far fewer GCPs in urban areas with tall buildings.


1. Introduction

Texturing three-dimensional (3D) buildings at a large scale for the generation of a photorealistic urban environment has been an important research topic in the computer vision [1,2,3,4], computer graphics [5] and remote sensing communities [6]. The prerequisite for texturing 3D buildings is the accurate geometric alignment of 2D texture images, captured by different types of devices such as airborne and spaceborne platforms, with their corresponding untextured 3D models, which are represented by various geometric data structures. This process is called 2D-3D registration. Researchers have encountered various real-world problems and have developed a variety of approaches for 2D-3D registration.
A commonly-adopted transformation model for a given ground point G can be expressed by:
$$\begin{pmatrix} X_G \\ Y_G \\ Z_G \end{pmatrix} = \begin{pmatrix} X_S \\ Y_S \\ Z_S \end{pmatrix} + R(\omega, \phi, \kappa) \begin{pmatrix} x_g \\ y_g \\ z_g \end{pmatrix} \qquad (1)$$
where $(X_G, Y_G, Z_G)^T$ are the coordinates of the point G in 3D space with respect to a geographic coordinate system; $(X_S, Y_S, Z_S)^T$ is the center of the perspective projection, i.e., the translation vector between the origins of the 2D image coordinate system and the geographic coordinate system; $(x_g, y_g, z_g)^T$ are the image coordinates of the ground point G with respect to the image coordinate system; $\omega$, $\phi$ and $\kappa$ are three rotation angles; and $R(\omega, \phi, \kappa)$ is a 3D orthogonal rotation matrix, as follows:
$$R = \begin{pmatrix} \cos\phi\cos\kappa & \cos\omega\sin\kappa + \sin\omega\sin\phi\cos\kappa & \sin\omega\sin\kappa - \cos\omega\sin\phi\cos\kappa \\ -\cos\phi\sin\kappa & \cos\omega\cos\kappa - \sin\omega\sin\phi\sin\kappa & \sin\omega\cos\kappa + \cos\omega\sin\phi\sin\kappa \\ \sin\phi & -\sin\omega\cos\phi & \cos\omega\cos\phi \end{pmatrix} \qquad (2)$$
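As a quick illustration of Equations (1) and (2), the following minimal NumPy sketch (ours, not the authors' code) builds $R(\omega, \phi, \kappa)$ and maps an image-space vector into object space; the numeric inputs are placeholder values borrowed from the solved EOPs in Table 2.

```python
import numpy as np

def rotation_matrix(omega: float, phi: float, kappa: float) -> np.ndarray:
    """The 3x3 photogrammetric rotation matrix R(omega, phi, kappa) of Equation (2)."""
    so, co = np.sin(omega), np.cos(omega)
    sp, cp = np.sin(phi), np.cos(phi)
    sk, ck = np.sin(kappa), np.cos(kappa)
    return np.array([
        [cp * ck,  co * sk + so * sp * ck,  so * sk - co * sp * ck],
        [-cp * sk, co * ck - so * sp * sk,  so * ck + co * sp * sk],
        [sp,       -so * cp,                co * cp],
    ])

def image_to_object(v_image: np.ndarray, S: np.ndarray,
                    omega: float, phi: float, kappa: float) -> np.ndarray:
    """Equation (1): map an image-space vector (x_g, y_g, z_g) into object space."""
    return S + rotation_matrix(omega, phi, kappa) @ v_image

# Placeholder example: a ray through the image point (10 mm, 20 mm) with z_g = -f.
G = image_to_object(np.array([0.010, 0.020, -0.153022]),
                    np.array([3143041.2, 1696533.4, 9071.8]),
                    omega=-0.0322, phi=-0.0017, kappa=-1.5544)
print(G)
```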
Traditional 2D-3D registration methods solve for the camera’s position $(X_S, Y_S, Z_S)^T$ and its orientation matrix R in Equations (1) and (2). However, no distortion model is perfect: none can completely describe the distortion over the entire image plane, so residual errors (incomplete correction) remain no matter which distortion model is adopted. As a result, the accuracy of the image coordinates across the image plane is asymmetric (see Figure 1). This phenomenon has been demonstrated in [7], and it becomes obvious for large-format aerial images, especially when a tall, large building occupies a large area of the image plane [8]. In this paper, two constraint conditions are added to the traditional transformation model (Equation (1)) to increase the accuracy, reliability and robustness of the image distortion correction. For the example shown in Figure 1, the image distortion in Region 1 differs from that in Regions 2 and 3. If a conventional transformation model (e.g., photogrammetric backward intersection (PBI)) is applied to register such a 2D image, which contains varied and asymmetric distortions, with a 3D object represented by a rigid model, errors in the 2D-3D registration are unavoidable. This problem is especially pronounced for high-rise buildings, because the bottom and top of a high-rise building are affected by different distortions in the same image. To avoid this problem, this paper presents a novel algorithm that uses two types of geometric constraints, i.e., co-planar and perpendicular conditions, to improve the accuracy of the 2D-3D registration. The two conditions added to the proposed transformation model, relative to the original rigorous transformation model, can be summarized as follows.
(1) Co-planar condition: the photogrammetric center, a line in the object coordinate system and the corresponding line in the image coordinate system lie in the same plane.
(2) Perpendicular condition: two lines that are perpendicular in the object coordinate system (e.g., adjacent edges of a flat building roof) are constrained to remain perpendicular in the adjusted object coordinates.
The remainder of this paper is organized as follows: Section 2 briefly overviews related work; Section 3 focuses on the development of a rigorous transformation model; Section 4 and Section 5 evaluate the correctness and accuracy of the proposed method and compare it to other existing methods; Section 6 summarizes the major contributions and advantages of the proposed method.

2. Related Work

2D-3D registration for different applications has been a very active research area in the last few decades. Researchers have developed a wide range of 2D-3D registration methods; papers in the existing literature provide an overall survey and study of various registration methods, such as [9,10,11,12,13,14,15,16,17]. Most of the registration methods consist of four steps as follows:
  • Feature detection: Existing methods can be classified into feature-based and intensity-based methods. Feature-based methods use point features, linear features and areal features extracted from both the 3D urban surface/building model and the 2D imagery as common feature pairs and then align the corresponding feature pairs via a transformation model. This approach has been widely applied in computer vision [18,19,20,21,22,23] and medical image processing [24,25]. Intensity-based methods utilize intensity-driven information, such as texture, reflectance, brightness and shadow, to achieve high accuracy in 2D-3D registration [11,13,26]. These features, which are called ground control points (GCPs) in remote sensing, can be detected manually or automatically.
  • Feature matching: The relationship between the features detected from the two images is established. Different feature similarity measures are used for the estimation.
  • Transform model estimation: The transform type and its parameters between two images are estimated by certain mapping functions based on the previously-established feature correspondence.
  • Image resampling and transformation: The slave image is resampled and transformed into the frame of the reference image, according to the transformation model and interpolation technique.
Among previous efforts, imaging geometry has been utilized as a rigorous transformation model by several researchers [26,27,28,29,30,31,32], who suggested integrating single straight lines into the orientation of images for building reconstruction and/or building recognition. Habib et al. [33] proposed that straight lines be utilized as constraints in photogrammetric bundle adjustment to solve the interior orientation parameters (IOPs) of the camera and the exterior orientation parameters (EOPs) of aerial images. This method has two major advantages: (1) straight lines are employed to enhance the reliability of bundle adjustment; and (2) the IOPs are treated as unknown parameters to be solved in the model. Its disadvantages are that: (1) the straight-line constraint is not as strong as the co-planarity constraint, because it does not tie together the image space, object space and perspective center; and (2) the straight-line constraint enhances, in theory, only the accuracy along the straight line, whereas the accuracy along other directions, such as that perpendicular to the line, remains relatively weak. Zhang et al. [34] developed a methodology to co-register laser scanning data and aerial images over urban areas. The method treats linear features as observed primitives and uses the coplanar condition as error equations to solve for the EOPs of the digital image. One advantage of this method is that it extends the straight-line constraint into the coplanar constraint, which combines features in object space, image space and the perspective center. One disadvantage is that it considers the digital camera as a rigid body whose IOPs are known and computed independently.
Considering the advantages and disadvantages of the two foregoing methods (i.e., [33,34]), this paper presents two types of constraint conditions, i.e., coplanar and perpendicular conditions. This idea comes from the fact that Zhou et al. [8] demonstrated that two aerial images over an urban area will have different accuracy of registration even with the same digital building model (DBM) and the same algorithm. This implies that highly accurate 2D-3D registration will not be achievable if constraint conditions are not applied.

3. Rigorous Transformation Model

We first briefly describe the fundamental concepts of the conventional rigorous transformation model, which is based on imaging geometry. For more detailed information, please refer to existing literature [27,28,29]. This paper only considers aerial photography. The conventional rigorous transformation model is represented by:
$$x_g - x_0 = -f\,\frac{r_{11}(X_G - X_S) + r_{12}(Y_G - Y_S) + r_{13}(Z_G - Z_S)}{r_{31}(X_G - X_S) + r_{32}(Y_G - Y_S) + r_{33}(Z_G - Z_S)} = f_x$$
$$y_g - y_0 = -f\,\frac{r_{21}(X_G - X_S) + r_{22}(Y_G - Y_S) + r_{23}(Z_G - Z_S)}{r_{31}(X_G - X_S) + r_{32}(Y_G - Y_S) + r_{33}(Z_G - Z_S)} = f_y \qquad (3)$$
where f, $x_0$, $y_0$ are the camera’s IOPs and $r_{ij}$ (i, j = 1, 2, 3) are the elements of the rotation matrix $R(\omega, \phi, \kappa)$ described in Equation (2). Considering two types of image distortion, radial and decentering, the image coordinates in Equation (3) are modified by:
$$\bar{x}_g = x_g + \delta x - x_0, \qquad \bar{y}_g = y_g + \delta y - y_0 \qquad (4)$$
where $(\bar{x}_g, \bar{y}_g)$ are the distortion-free image coordinates after correction, and $\delta x$, $\delta y$ are the corrections given by the image distortion models below:
$$\delta x = \underbrace{\kappa_1 x_g (x_g^2 + y_g^2)}_{\mathrm{radial}} + \underbrace{\rho_1 (3x_g^2 + y_g^2) + 2\rho_2 x_g y_g}_{\mathrm{decentering}}$$
$$\delta y = \underbrace{\kappa_1 y_g (x_g^2 + y_g^2)}_{\mathrm{radial}} + \underbrace{\rho_2 (x_g^2 + 3y_g^2) + 2\rho_1 x_g y_g}_{\mathrm{decentering}} \qquad (5)$$
where $\kappa_1$, $\rho_1$ and $\rho_2$ are the coefficients of radial and decentering distortion, respectively, and $(x_g, y_g)$ are the coordinates of a target point in the defined image coordinate system.
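To tie Equations (3)–(5) together, here is a brief illustrative sketch (ours, with assumed coefficient values, not the calibrated parameters of Table 1) that projects a ground point through the collinearity equations and then applies the distortion corrections:

```python
import numpy as np

def project_to_image(G, S, R, f, x0=0.0, y0=0.0):
    """Collinearity equations (3): project ground point G onto the image plane.
    R is the matrix whose elements r_ij appear in Equation (3); f is the focal length."""
    u = R @ (np.asarray(G, float) - np.asarray(S, float))
    return x0 - f * u[0] / u[2], y0 - f * u[1] / u[2]

def correct_distortion(xg, yg, k1, rho1, rho2, x0=0.0, y0=0.0):
    """Equations (4) and (5): radial + decentering corrections and principal-point shift."""
    r2 = xg ** 2 + yg ** 2
    dx = k1 * xg * r2 + rho1 * (3 * xg ** 2 + yg ** 2) + 2 * rho2 * xg * yg
    dy = k1 * yg * r2 + rho2 * (xg ** 2 + 3 * yg ** 2) + 2 * rho1 * xg * yg
    return xg + dx - x0, yg + dy - y0

# Nadir-looking camera (R = identity) 1650 m above a point offset 100 m east;
# f = 153.022 mm, distortion coefficients are arbitrary placeholders.
xg, yg = project_to_image([100.0, 0.0, 0.0], [0.0, 0.0, 1650.0], np.eye(3), f=0.153022)
print(correct_distortion(xg, yg, k1=-2.0e-5, rho1=1.0e-7, rho2=-1.0e-7))
```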
Because Equation (3) is not linear, it must be linearized with respect to both IOPs and EOPs. The vector form of the registration model can be expressed by:
$$V = A_1 X_1 + A_2 X_2 - L \qquad (6)$$
where $X_1$ represents the EOPs; $X_2$ denotes the IOPs; V is the residual error vector; and $A_1$ and $A_2$ are the corresponding coefficient matrices, whose partial derivatives with respect to the registration parameters can be found in [35]. The vectors in Equation (6) are $X_1 = (\phi, \omega, \kappa, X_S, Y_S, Z_S)^T$, $X_2 = (\kappa_1, \rho_1, \rho_2)^T$ and $V = (V_{x_g}, V_{y_g})^T$,
$$A_1 = \begin{bmatrix} \dfrac{\partial f_x}{\partial X_S} & \dfrac{\partial f_x}{\partial Y_S} & \dfrac{\partial f_x}{\partial Z_S} & \dfrac{\partial f_x}{\partial \phi} & \dfrac{\partial f_x}{\partial \omega} & \dfrac{\partial f_x}{\partial \kappa} \\[2mm] \dfrac{\partial f_y}{\partial X_S} & \dfrac{\partial f_y}{\partial Y_S} & \dfrac{\partial f_y}{\partial Z_S} & \dfrac{\partial f_y}{\partial \phi} & \dfrac{\partial f_y}{\partial \omega} & \dfrac{\partial f_y}{\partial \kappa} \end{bmatrix}, \quad A_2 = \begin{bmatrix} \dfrac{\partial f_x}{\partial \kappa_1} & \dfrac{\partial f_x}{\partial \rho_1} & \dfrac{\partial f_x}{\partial \rho_2} \\[2mm] \dfrac{\partial f_y}{\partial \kappa_1} & \dfrac{\partial f_y}{\partial \rho_1} & \dfrac{\partial f_y}{\partial \rho_2} \end{bmatrix}$$
$$L = \begin{pmatrix} L_{x_g} \\ L_{y_g} \end{pmatrix} = \begin{bmatrix} x_g + f\,\dfrac{r_{11}(X_G - X_S) + r_{12}(Y_G - Y_S) + r_{13}(Z_G - Z_S)}{r_{31}(X_G - X_S) + r_{32}(Y_G - Y_S) + r_{33}(Z_G - Z_S)} \\[3mm] y_g + f\,\dfrac{r_{21}(X_G - X_S) + r_{22}(Y_G - Y_S) + r_{23}(Z_G - Z_S)}{r_{31}(X_G - X_S) + r_{32}(Y_G - Y_S) + r_{33}(Z_G - Z_S)} \end{bmatrix}$$
The registration parameters in Equation (6) are conventionally solved by least-squares adjustment using a number of well-distributed GCPs.
As mentioned previously, the conventional registration model is not robust for high-rise buildings in urban areas. Therefore, this paper proposes two types of conditions, co-planarity and perpendicularity, as constraints to the conventional model Equation (6). Mathematically, the two types of conditions are described as follows.

3.1. Coplanar Condition

The straight-line condition does not combine the features in image space, object space and the perspective center; thus, this paper adopts a coplanar constraint condition similar to that of Zhang et al. [34]. As shown in Figure 2, given two points P and Q on the object surface, their corresponding image points are p and q. Five points, O (camera exposure center), P, Q, p and q, should theoretically be coplanar. The mathematical model that describes the coplanar condition using the three vectors $\vec{OP}$, $\vec{OQ}$ and $\vec{Op}$ is:
$$\begin{vmatrix} u_p & v_p & w_p \\ X_P - X_S & Y_P - Y_S & Z_P - Z_S \\ X_Q - X_S & Y_Q - Y_S & Z_Q - Z_S \end{vmatrix} = F_p = 0 \qquad (7)$$
where $F_p$ is the coplanarity residual; $(X_P, Y_P, Z_P)$ and $(X_Q, Y_Q, Z_Q)$ are the coordinates of the ground points P and Q, respectively, with respect to the ground coordinate system; and $(u_p, v_p, w_p)$ are the coordinates of the image point p with respect to the camera coordinate system.
Combining Equations (1) and (3), Equation (7) yields
$$\begin{vmatrix} u_p & v_p & w_p \\ X_P - X_S & Y_P - Y_S & Z_P - Z_S \\ X_Q - X_S & Y_Q - Y_S & Z_Q - Z_S \end{vmatrix} = \begin{vmatrix} u_p & v_p & w_p \\ \bar{X}_P & \bar{Y}_P & \bar{Z}_P \\ \bar{X}_Q & \bar{Y}_Q & \bar{Z}_Q \end{vmatrix} = F_p = 0 \qquad (8)$$
where:
  • $(\bar{X}_P, \bar{Y}_P, \bar{Z}_P) = (X_P - X_S,\; Y_P - Y_S,\; Z_P - Z_S)$ and $(\bar{X}_Q, \bar{Y}_Q, \bar{Z}_Q) = (X_Q - X_S,\; Y_Q - Y_S,\; Z_Q - Z_S)$.
Let:
  • $A = \bar{Y}_P \bar{Z}_Q - \bar{Y}_Q \bar{Z}_P$, $B = \bar{Z}_P \bar{X}_Q - \bar{Z}_Q \bar{X}_P$, $C = \bar{X}_P \bar{Y}_Q - \bar{X}_Q \bar{Y}_P$;
Equation (7) can be rewritten as:
$$F_p = A u_p + B v_p + C w_p \qquad (9)$$
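Numerically, $F_p$ in Equations (7)–(9) is simply the scalar triple product of the three vectors emanating from the perspective center. A minimal sketch (ours, with hypothetical coordinates):

```python
import numpy as np

def coplanarity_residual(op, P, Q, S):
    """F_p of Equations (7)-(8): the determinant (scalar triple product) of the
    vectors O->p, O->P and O->Q from the perspective center S."""
    return np.linalg.det(np.vstack([op, P - S, Q - S]))

# Hypothetical configuration: the image ray op points exactly at P, so F_p ~ 0.
S = np.array([0.0, 0.0, 1650.0])
P = np.array([100.0, 0.0, 0.0])
Q = np.array([200.0, 0.0, 50.0])
op = 0.001 * (P - S)
print(coplanarity_residual(op, P, Q, S))  # ~0 up to floating-point rounding
```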
With the coplanar constraint established in Equation (9), and considering the differential terms with respect to the unknown parameters when linearizing by Taylor series, Equation (9) can also be rewritten in differential form as:
$$\begin{vmatrix} u_p + du_p & v_p + dv_p & w_p + dw_p \\ \bar{X}_P + dX_P - dX_S & \bar{Y}_P + dY_P - dY_S & \bar{Z}_P + dZ_P - dZ_S \\ \bar{X}_Q + dX_Q - dX_S & \bar{Y}_Q + dY_Q - dY_S & \bar{Z}_Q + dZ_Q - dZ_S \end{vmatrix} = F_p = 0 \qquad (10)$$
where $d(\cdot) = \partial F_p / \partial (\cdot)$, and Equation (10) is rewritten as:
$$A_1^p\,d\phi + A_2^p\,d\omega + A_3^p\,d\kappa + A_4^p\,dX_S + A_5^p\,dY_S + A_6^p\,dZ_S + F_p = 0 \qquad (11)$$
where:
  • A 1 p = A [ r 31 x P r 32 y P + r 33 f ] + C [ r 11 x p + r 12 y p r 13 f ]
  • A 2 p = A [ ( sin ϕ cos ω sin κ ) x p + ( sin ϕ cos ω cos κ ) y p f sin ϕ sin ω ] + B [ ( sin ϕ sin κ ) x p + ( sin ω cos κ ) y p + f cos ω ] + C [ ( cos ϕ cos ω sin κ ) x p + ( cos ϕ cos ω cos κ ) y p + f cos ϕ sin ω ]
  • A 3 p = A [ r 12 x p r 11 y p ] + B [ ( cos ω cos κ ) x p + ( cos ω sin κ ) y p ] + C [ r 32 x p r 31 y p ]
  • A 4 p = [ ( Y Q Y P ) w p + ( Z Q Z P ) v p ] ,   A 5 p = [ ( X Q X P ) w p + ( Z Q Z P ) u p ]
  • A 6 p = [ ( X Q X P ) v p + ( Y Q Y P ) u p ] ,   ( x p , y p )
are image coordinates of point p on the image plane after all corrections in Section 3.1; other symbols are the same as above.
Equation (11) is rewritten in vector form as below:
$$C_1 \hat{X}_1 + W_1 = 0 \qquad (12)$$
where:
  • $C_1 = \big[A_1^p \;\; A_2^p \;\; A_3^p \;\; A_4^p \;\; A_5^p \;\; A_6^p\big]$, $\hat{X}_1 = \big(d\phi \;\; d\omega \;\; d\kappa \;\; dX_S \;\; dY_S \;\; dZ_S\big)^T$,
  • $W_1 = F_p$.

3.2. Perpendicular Condition

The perpendicularity constraint in 3D space is added to ensure that perpendicularity is preserved for the registered 2D and 3D pairs after the registration transformation is applied. As shown in Figure 3, suppose that AB and BC are two edges of a flat house roof and that their corresponding edges in the 2D image plane are ab and bc. Suppose that the coordinates of A, B and C in the object coordinate system are $(X_A, Y_A, Z_A)$, $(X_B, Y_B, Z_B)$ and $(X_C, Y_C, Z_C)$, respectively. For a flat-roofed, cuboid house, the heights (i.e., Z coordinates) of A, B and C are the same, and the line segments AB and BC are perpendicular to each other. If AB is not perpendicular to BC, i.e., B deviates from its correct position to B′, a line is drawn from C to a point O such that CO is perpendicular to AB′; with θ denoting the direction angle of AB, the distance l between B and O can be expressed by [8]:
$$l = (X_C - X_B)\cos\theta + (Y_C - Y_B)\sin\theta = \frac{(Y_C - Y_B)(Y_B - Y_A) + (X_C - X_B)(X_B - X_A)}{S_{AB}} \qquad (13)$$
where SAB is the length of the segment AB. Theoretically, the distance l should be zero. Thus, the differential form of Equation (13) is:
$$l_0 + \Delta l = \frac{(Y_C - Y_B)(Y_B - Y_A) + (X_C - X_B)(X_B - X_A)}{S_{AB}} + \frac{(X_B - X_A)\Delta X_C + (Y_B - Y_A)\Delta Y_C + (X_B - X_C)\Delta X_A + (Y_B - Y_C)\Delta Y_A + (X_C - 2X_B + X_A)\Delta X_B + (Y_C - 2Y_B + Y_A)\Delta Y_B}{S_{AB}} = 0 \qquad (14)$$
Equation (14) can be rewritten in matrix form as below:
$$\begin{bmatrix} X_B - X_C \\ Y_B - Y_C \\ X_C - 2X_B + X_A \\ Y_C - 2Y_B + Y_A \\ X_B - X_A \\ Y_B - Y_A \end{bmatrix}^T \begin{bmatrix} \Delta X_A \\ \Delta Y_A \\ \Delta X_B \\ \Delta Y_B \\ \Delta X_C \\ \Delta Y_C \end{bmatrix} + \big[(X_C - X_B)(X_B - X_A) + (Y_C - Y_B)(Y_B - Y_A)\big] = 0 \qquad (15)$$
Equation (15) can then be rewritten in vector form as:
$$C_2 \hat{X}_2 + W_2 = 0 \qquad (16)$$
where:
  • $C_2 = \big[(X_B - X_C)\;\; (Y_B - Y_C)\;\; (X_C - 2X_B + X_A)\;\; (Y_C - 2Y_B + Y_A)\;\; (X_B - X_A)\;\; (Y_B - Y_A)\big]$,
  • $\hat{X}_2 = \big(\Delta X_A\;\; \Delta Y_A\;\; \Delta X_B\;\; \Delta Y_B\;\; \Delta X_C\;\; \Delta Y_C\big)^T$,
  • $W_2 = (X_C - X_B)(X_B - X_A) + (Y_C - Y_B)(Y_B - Y_A)$.
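The sketch below (ours, with illustrative coordinates) assembles the constraint row $C_2$ and the misclosure $W_2$ of Equation (16) for three roof corners A, B and C; note that $W_2$ is the dot product of AB and BC, which vanishes exactly when the two edges are perpendicular.

```python
import numpy as np

def perpendicularity_constraint(A, B, C):
    """Return (C2, W2) of Equation (16) for corners A, B, C given as (X, Y) pairs.
    C2 is ordered to pair with (dXA, dYA, dXB, dYB, dXC, dYC)."""
    (XA, YA), (XB, YB), (XC, YC) = A, B, C
    C2 = np.array([XB - XC, YB - YC,
                   XC - 2 * XB + XA, YC - 2 * YB + YA,
                   XB - XA, YB - YA])
    W2 = (XC - XB) * (XB - XA) + (YC - YB) * (YB - YA)  # dot(AB, BC)
    return C2, W2

# A right-angled roof corner at B gives W2 = 0:
C2, W2 = perpendicularity_constraint((0.0, 0.0), (20.0, 0.0), (20.0, 15.0))
print(C2, W2)  # W2 == 0.0
```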

3.3. Solution of Registration Parameters

By combining Equations (6), (12) and (16), we have:
$$\begin{cases} V = A_1 X_1 + A_2 X_2 - L \\ C_1 \hat{X}_1 + W_1 = 0 \\ C_2 \hat{X}_2 + W_2 = 0 \end{cases} \qquad (17)$$
Furthermore, Equation (17) can be rewritten as:
$$\begin{cases} V = D\,\delta X - L \\ C_x\,\delta X + W_x = 0 \end{cases} \qquad (18)$$
where:
  • $C_x = (C_1 \;\; C_2)^T$, $W_x = (W_1 \;\; W_2)^T$, $D = (A_1 \;\; A_2)$,
  • $\delta X = \big(\Delta\phi\;\; \Delta\omega\;\; \Delta\kappa\;\; \Delta X_S\;\; \Delta Y_S\;\; \Delta Z_S\;\; \Delta X_A\;\; \Delta Y_A\;\; \Delta Z_A\;\; \cdots\;\; \Delta X_N\;\; \Delta Y_N\;\; \Delta Z_N\big)^T$
Equation (18) is the model derived in this paper for 2D-3D registration; distinct from the conventional model of Equation (6), it is called the rigorous model in this paper. As can be seen in Equation (18), this registration model combines the linear edges and the conventional point features of a building, and it should therefore achieve higher accuracy and reliability than the conventional model. Equation (18) is solved using least-squares estimation, which minimizes [36]:
$$\Phi = V^T V + 2 K_S^T (C_X\,\delta X + W_X) = \min \qquad (19)$$
With least-squares estimation, the normal equations can be written as:
$$\begin{cases} D^T D\,\delta X + C_X^T K_S - D^T L = 0 \\ C_X\,\delta X + 0 \cdot K_S + W_X = 0 \end{cases} \qquad (20)$$
where $K_S$ is a vector of introduced unknowns (Lagrange multipliers; for details, please refer to [37]). If the total number of constraint equations is M, the dimension of $K_S$ is M × 1, i.e., $K_S = (K_{S1}, K_{S2}, \ldots, K_{SM})^T$.
Let $N_{XX} = D^T D$; Equation (20) can then be rewritten as:
$$\begin{pmatrix} N_{XX} & C_X^T \\ C_X & 0 \end{pmatrix} \begin{pmatrix} \delta X \\ K_S \end{pmatrix} + \begin{pmatrix} -D^T L \\ W_X \end{pmatrix} = 0 \qquad (21)$$
The solution for the unknown parameters is then:
$$\begin{cases} \delta X = Q_{11} D^T L - Q_{12} W_X \\ K_S = Q_{21} D^T L - Q_{22} W_X \end{cases} \qquad (22)$$
where $Q_{ij}$ (i, j = 1, 2) are the blocks of the covariance matrix, which is the inverse of the normal matrix, i.e.,
$$\begin{pmatrix} N_{XX} & C_X^T \\ C_X & 0 \end{pmatrix}^{-1} = \begin{pmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{pmatrix} \qquad (23)$$
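Equations (20)–(23) form the classic bordered-normal (Lagrange-multiplier) system of equality-constrained least squares. A generic NumPy sketch (ours, not the authors' implementation), assuming D, L, $C_x$ and $W_x$ have already been assembled:

```python
import numpy as np

def solve_constrained_lsq(D, L, Cx, Wx):
    """Solve min ||D dX - L||^2 subject to Cx dX + Wx = 0 via the bordered
    normal system of Equation (21); returns (dX, K_S)."""
    n, m = D.shape[1], Cx.shape[0]
    N = D.T @ D                                    # N_XX
    K = np.block([[N, Cx.T], [Cx, np.zeros((m, m))]])
    rhs = np.concatenate([D.T @ L, -Wx])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]                        # corrections, Lagrange multipliers

# Tiny synthetic example: 2 unknowns, 3 observations, 1 constraint (x1 = x2).
D = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
L = np.array([1.0, 3.0, 4.0])
Cx = np.array([[1.0, -1.0]])
Wx = np.array([0.0])
print(solve_constrained_lsq(D, L, Cx, Wx))  # dX = [2, 2]
```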

3.4. Accuracy Evaluation

The standard deviations of the unit weight and unknown parameters are typically used to evaluate the quality of adjustment and the accuracy of adjusted unknown parameters. The standard deviation of the unit weight δ0 is computed by:
$$\delta_0 = \sqrt{\frac{V^T V}{r}} \qquad (24)$$
where V is the residual vector and r is the redundancy of the observation equation.
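A one-line companion to Equation (24), assuming the residual vector V and the redundancy r are available (illustrative values only):

```python
import numpy as np

def sigma0(V: np.ndarray, r: int) -> float:
    """Standard deviation of unit weight, Equation (24)."""
    return float(np.sqrt(V @ V / r))

print(sigma0(np.array([0.3, -0.5, 0.2, 0.4]), r=2))  # ~0.52
```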

3.5. Discussion

3.5.1. Coplanar and Collinear Constraint

Lines have been used as registration primitives in previous work. As shown in Figure 4, if any three points B′, E′ and C′ in object space are collinear, a collinear constraint condition can be established according to the methods proposed in [8,33]. The line constraints can be analyzed as follows.
  • Under the collinear constraint, the three points B′, E′ and C′ are considered to lie on a line in object space. However, due to incomplete camera calibration in practice, their projections b′, e′ and c′ cannot be guaranteed to lie on a line in image space. That is, the collinear constraint condition ensures collinearity only in object space, not in image space.
  • If the image were completely free of distortion after processing (which is impossible in practice), the projections b′, e′ and c′ would lie on a line in image space, i.e., they would also satisfy the collinear constraint condition in image space. Under this ideal condition, the collinear constraint is consistent with the coplanar constraint; in other words, the two constraints would be highly correlated if considered simultaneously in one formula.
  • Because the camera lens cannot be completely calibrated, residual image distortion remains even when very high camera calibration accuracy is achieved. As a result, a line in object space is distortion free, but its corresponding line in image space still contains distortion. The coplanar constraint combines a line in object space with its corresponding line in image space, whereas the collinear constraint considers only the line in object space. Thus, the coplanar constraint condition is much stricter than the collinear constraint condition.
Given the foregoing analysis, this paper uses the coplanar condition instead of the collinear condition.

3.5.2. Perpendicular Constraint

Perspective projection does not preserve angles. For example, as shown in Figure 4, if θ = 90° in object space, β is generally not 90° after perspective projection. Furthermore, two distinct configurations with θ = θ′ = 90° can project to the same angle β. This means that: (1) the perpendicular constraint is capable of enhancing the registration accuracy; but (2) it may lead to multiple solutions for the IOPs and EOPs if the perpendicular constraint is considered alone, without the coplanar constraint and/or GCPs. For this reason, this paper simultaneously considers the perpendicular constraint, the coplanar constraint and a few GCPs. The perpendicular constraint proposed in this paper therefore not only enhances the registration accuracy but also avoids the risk of multiple solutions for the IOPs and EOPs.
In summary, the coplanarity constraint in Figure 2 improves the accuracy of co-registration along a line (i.e., a building edge), and the perpendicular constraint in Figure 3 improves the accuracy of co-registration along perpendicular lines (i.e., two perpendicular building edges). Together, the two types of constraints maximize the symmetry of accuracy across the image without the need for additional GCPs.

4. Experiments and Analysis

4.1. Experimental Data

The experimental field is located in downtown Denver, Colorado, where the highest buildings reach 125 m, and many buildings are approximately 100 m. The City and County of Denver, Colorado, provided us with the aerial images and digital surface model (DSM).
  • Aerial images: The six aerial images, in two flight strips with an end lap of approximately 65% and a side lap of 30%, were acquired with a Leica RC30 aerial camera at a nominal focal length of 153.022 mm on 17 April 2000 (see Figure 5). The flying height was 1650 m above the mean ground elevation of the imaged area. The aerial photos were originally recorded on film and later scanned into digital format at a pixel resolution of 25 μm.
  • DSM dataset: The DSM in the central part of downtown Denver was originally acquired by LiDAR and then edited into grid format at a spacing of 0.6 m (Figure 6a). The DSM accuracy in planimetry and height was approximately 0.1 m and 0.2 m, respectively. The horizontal datum is Geodetic Reference System (GRS) 1980, and the vertical datum is North American Datum of 1983 (NAD83).
With the datasets provided, we conducted data pre-processing and 3D coordinate measurement.
  • CSG representation: With the given DSM (see Figure 6a), a 3D CSG (constructive solid geometry) model was utilized to represent each building individually. CSG is a very mature method that uses a few basic primitives to create a complex surface or object through Boolean operators [38]; the details of the CSG method can be found in [6,38]. This paper proposes a three-level data structure to represent a building based on the given DSM data. The first level is the 2D primitive representation, such as a rectangle, circle, triangle or polygon, accompanied by its own parameters, including length, width, radius and the position of the reference point. The second level is the 3D primitive representation, which is created by adding height information to the 2D primitives. For example, by adding height information to a rectangle (2D primitive), it becomes a cube (3D primitive); by adding height information to a circle (2D primitive), it becomes a cylinder (3D primitive). In other words, cubes, cylinders, cones, pyramids, etc., are single 3D primitives. The third level creates the 3D CSG representation of a building by combining multiple 3D primitives. During this process, a topological relationship is created as well, so that the building data can be easily retrieved and the geometric constraints proposed in this paper can be easily constructed. Meanwhile, attributes (e.g., length, position, direction, grey level) are assigned to each building (see Figure 6b). A minimal data-structure sketch is given after this list.
  • Coordinate measurement of “GCPs” and checkpoints: In areas with numerous high buildings, it is impossible to find conventional photogrammetrically-targeted points to serve as ground control points (GCPs). In this paper, the corner points of building roofs or bottoms are therefore taken as “GCPs”. The 3D (XYZ) coordinates of the “GCPs” are acquired from the DBM, and the corresponding 2D image coordinates are measured with Erdas/Imagine software. The first three “GCPs” are selected manually: their locations are selected in the DBM, where their 3D coordinates can be read directly, and the corresponding locations in the 2D images are then found manually with Erdas/Imagine. After that, all other “GCPs” are measured automatically, based on back-projection: the EOPs of each image are solved from the first three measured “GCPs”, and the 3D coordinates of the other building corners are back-projected onto the image plane, from which the 2D image coordinates are measured automatically (Figure 7a). The measurement accuracy of the 2D image coordinates is at the sub-pixel level. With the operation above, 321 points are measured, of which 232 are taken as “GCPs” and 89 are used as checkpoints to evaluate the final accuracy of the 2D-3D registration (Figure 6a and Figure 7b).
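To make the three-level representation concrete, the following is a minimal sketch of the 2D-primitive → 3D-primitive → CSG-building hierarchy; all class names and fields are illustrative assumptions, not the authors' actual data structure:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Rectangle:
    """Level 1: a 2D primitive with its own parameters."""
    x: float          # position of the reference corner
    y: float
    length: float
    width: float

@dataclass
class Cube:
    """Level 2: a 3D primitive, i.e., a 2D primitive plus a height."""
    base: Rectangle
    height: float

@dataclass
class Building:
    """Level 3: a CSG building combining several 3D primitives (Boolean union)."""
    name: str
    primitives: List[Cube] = field(default_factory=list)

    def roof_height(self) -> float:
        return max(p.height for p in self.primitives)

# A hypothetical two-block tower: a 30 m x 20 m base topped by a smaller block.
tower = Building("hypothetical_tower", [
    Cube(Rectangle(0.0, 0.0, 30.0, 20.0), 100.0),
    Cube(Rectangle(5.0, 5.0, 10.0, 10.0), 125.0),
])
print(tower.roof_height())  # 125.0
```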

4.2. Establishment of the Registration Model

With the “GCPs” measured in Section 4.1, the 2D-3D registration model of Equation (18) can be established. Meanwhile, the two types of geometric constraint conditions, co-planarity (Equation (12)) and perpendicularity (Equation (16)), can simultaneously be established using the measured “GCPs”. For example, the faces Fa, Fb and Fc, the edges L1, L2 and L3, and their attributes are described in the CSG model (see Figure 8a,b). From the attributes of the face data structure, L1 and L3 are automatically recognized as roof edges, and L2 as a wall edge. Meanwhile, the control points P1, P3 and P5 are extracted, from which the lines l1, l2 and l3 can be automatically constructed. When L1, L2 and L3 are back-projected onto the original aerial image, they are matched with the lines l1, l2 and l3, respectively (Figure 8b). The attributes (e.g., roof edge, wall edge) and topological relationships (i.e., perpendicularity) of l1, l2 and l3 are thus inherited from L1, L2 and L3: l1 and l3 are roof edges, l2 is a vertical wall line, l2 is perpendicular to both l1 and l3, and l1 is perpendicular to l3. Therefore, for l1 ⊥ l3, Equation (16) is applied to construct a perpendicular constraint condition by replacing A, B and C with P5, P3 and P1 (see Figure 3). Similarly, the coplanar constraint condition (Equation (12)) can be established as well. Combining all of the established equations above yields the registration equations of Equation (18).
With the registration model established above, the solution is found using least-squares estimation. Owing to the nonlinearity of the registration model (Equation (18)), an iterative solution is needed until the solved registration parameters converge under a given threshold. The solved results are listed in Section 5.

4.3. Experimental Results

With the solved registration parameters, we project the 3D urban model back onto the aerial image. The 106 3D CSG building models are registered onto the original 2D aerial image, as shown in Figure 9. Note that the 3D CSG models are drawn as transparent wireframes to avoid the impact of occluded building wireframes when verifying the accuracy of the 2D-3D registration.

5. Accuracy Comparison and Analysis

The accuracy comparison of the 2D-3D registration is designed around the conventional registration model (i.e., Equation (1)) and three types of ground control primitives. The evaluation indices include: (1) the theoretical accuracy of the solved registration parameters; (2) the relative accuracy in 2D space and 3D space; and (3) a visual check.

5.1. Comparison of Theoretical Accuracy

The IOPs, including the focal length and principal point coordinates, were precisely calibrated and provided by the data vendor. The registration parameters (the EOPs $X_S$, $Y_S$, $Z_S$, $\phi$, $\omega$ and $\kappa$) and the image distortion parameters ($\kappa_1$, $\rho_1$ and $\rho_2$) are solved using Equation (22). The results and standard deviations are listed in Table 1 and Table 2. As can be seen there, the proposed model has the highest accuracy, with a standard deviation of 0.47 pixels for the solved distortion parameters and 0.37 pixels for the solved EOPs.

5.2. Accuracy Comparison in 2D Space

The accuracy comparison in the 2D image plane is conducted to evaluate the correctness and robustness of the proposed method. The steps of the comparison analysis in 2D space are as follows: (1) with the registration parameters solved above, the 3D buildings are back-projected onto the aerial image plane; (2) the x-y image coordinates of the back-projected and original 59 checkpoints are measured; (3) the root mean square (RMS) error of the 59 checkpoints is computed by $RMS_X = \sqrt{\Delta^T \Delta / N}$, where Δ is the vector of differences in the x coordinates between the back-projected and original checkpoints and N is the number of checkpoints used; the computation for the y coordinates is analogous.
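The checkpoint statistic can be reproduced in a few lines (a sketch with hypothetical coordinate arrays):

```python
import numpy as np

def rms(projected: np.ndarray, measured: np.ndarray) -> float:
    """RMS of Section 5.2: sqrt(Delta^T Delta / N) for one coordinate axis."""
    delta = projected - measured
    return float(np.sqrt(delta @ delta / delta.size))

# Hypothetical x coordinates of back-projected vs. originally measured checkpoints:
print(rms(np.array([101.8, 250.3, 399.1]), np.array([100.0, 252.0, 401.0])))
```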
As illustrated in Table 1, the accuracy estimate for the lens distortion parameters ($\delta_0^{IOP}$) is 1.23 pixels and 1.09 pixels when using 12 GCPs and 211 GCPs, respectively, with the conventional model. By comparison, $\delta_0^{IOP}$ is 0.47 pixels when using 32 points with the proposed model, an improvement of 61.8% over the conventional model with 12 GCPs and 56.9% with 211 GCPs.
Similarly, from Table 2, the accuracy estimate for the exterior orientation parameters ($\delta_0^{EOP}$) is 1.26 and 1.02 pixels when using 12 “GCPs” and 211 “GCPs” with the conventional model. By comparison, $\delta_0^{EOP}$ is 0.37 pixels when using 32 points with the proposed model, an improvement of 70.6% over the conventional model with 12 GCPs and 63.7% with 211 GCPs.
From Table 3, there are significant offsets between the building wireframes and the building edges for the conventional model, whose RMS values are approximately 10 pixels along the x and y directions, whereas the RMS of the proposed method is only about two pixels in both directions, an improvement of 78.0% over the conventional model with 12 GCPs and 67.2% with 211 GCPs. These results demonstrate that the accuracy of 2D-3D registration achieved by the proposed method is greatly increased.
Figure 10 depicts the 2D-3D registration results for visual inspection using the proposed and conventional models. Five typical buildings with different heights, located at different positions in the image plane, are shown. Panels a1 through e1 are the 2D-3D registration results created by the proposed model, in which 32 points plus the two types of constraints are employed; a2 through e2 and a3 through e3 are the results created by the conventional model using 211 “GCPs” and 12 “GCPs”, respectively.

6. Conclusions

The major contribution of this paper is the development of a rigorous transformation model for 2D-3D registration to increase its accuracy and reliability. This model simultaneously considers two types of geometric constraints (perpendicularity and co-planarity) and utilizes both point features and linear features as registration primitives. The details of the proposed 2D-3D registration model were presented.
A test field located in downtown Denver, Colorado, was used to evaluate the proposed method, and a comparative analysis of the accuracy achieved by the proposed and conventional methods was conducted. The experimental results demonstrated that: (1) the theoretical accuracy of the solved registration parameters can reach 0.47 pixels, whereas the other methods reach only 1.23 and 1.09 pixels; (2) the RMS of the 2D-3D registration created by the conventional model is approximately 10 pixels along both the x and y directions, whereas the RMS from the proposed method is only about two pixels in both directions. In addition, five buildings with different heights, located at different positions in the aerial image plane, were visually evaluated. The results demonstrate that the method proposed in this paper achieves higher accuracy than conventional methods and that 3D buildings represented by wireframes can be exactly registered to the 2D aerial imagery.

Acknowledgments

The author sincerely thanks Wolfgang Schickler for his help on the delivery of aerial images, building data, DSM and metadata. The author also thanks the project administrators of the City and County of Denver, Colorado, for granting permission to use their data. This paper is supported by the USA National Science Foundation under Project Number NSF 0131893 and by NSFC under Grant Numbers 41431179, 41162011 and 41271395, the GuangXi Governor Grant under Approval Number 2010-169 and GuangXi NSF under Contract Numbers 2015GXNSFDA139032 and 2012 GXNSFCB053005.

Author Contributions

G.Z. contributed most to this manuscript: he developed the main idea and wrote the initial draft. Q.L. contributed to reorganizing the framework and repeatedly reviewing the manuscript. W.X. supported data processing and analysis. T.Y. and J.H. assisted with the final publication, and Y.S. contributed to editing and reviewing the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Clarkson, M.J.; Rueckert, D.; Hill, D.L.; Hawkes, D.J. Using photo-consistency to register 2D optical images of the human face to a 3D surface model. IEEE Trans. Pattern Anal. Mach. Intell. 2001, 23, 1266–1280. [Google Scholar] [CrossRef]
  2. Liu, L.; Stamos, I. A systematic approach for 2D-image to 3D-range registration in urban environments. Comput. Vis. Image Underst. 2012, 116, 25–37. [Google Scholar] [CrossRef]
  3. Zhou, G.; Song, C.; Schickler, W. Urban 3D GIS from LIDAR and aerial image data. Comput. Geosci. 2004, 30, 345–353. [Google Scholar] [CrossRef]
  4. Christian, F.; Zakhor, A. An automated method for large-scale, ground-based city model acquisition. Int. J. Comput. Vis. 2004, 60, 5–24. [Google Scholar]
  5. Poullis, C.; You, S. Photorealistic large-scale urban city model reconstruction. IEEE Trans. Vis. Comput. Graph. 2009, 15, 654–669. [Google Scholar] [CrossRef] [PubMed]
  6. Zhou, G.; Chen, W.; Kelmelis, J. A Comprehensive Study on Urban True Orthorectification. IEEE Trans. Geosci. Remote Sens. 2005, 43, 2138–2147. [Google Scholar] [CrossRef]
  7. Zhou, G.; Uzi, E.; Feng, W.; Yuan, B. CCD camera calibration based on natural landmark. Pattern Recognit. 1998, 31, 1715–1724. [Google Scholar] [CrossRef]
  8. Zhou, G.; Xie, W.; Cheng, P. Orthoimage creation of extremely high buildings. IEEE Trans. Geosci. Remote Sens. 2008, 46, 4132–4141. [Google Scholar] [CrossRef]
  9. Brown, G. A survey of image registration techniques. ACM Comput. Surv. 1992, 24, 325–376. [Google Scholar] [CrossRef]
  10. Besl, J.; McKay, N. A method for registration of 3D shapes. IEEE Trans. Pattern Anal. Mach. Intell. 1992, 14, 239–256. [Google Scholar] [CrossRef]
  11. Maintz, B.; van den Elsen, P.; Viergever, M. 3D multimodality medical image registration using morphological tools. Image Vis. Comput. 2001, 19, 53–62. [Google Scholar] [CrossRef]
  12. Wang, S.; Tseng, Y.-H. Image orientation by fitting line segments to edge pixels. In Proceedings of the Asian Conference on Remote Sensing (ACRS), Kathmandu, Nepal, 25–29 November 2002.
  13. Zitová, B.; Flusser, J. Image registration methods: A survey. Image Vis. Comput. 2003, 21, 977–1000. [Google Scholar] [CrossRef]
  14. Salvi, J.; Matabosch, C.; Fofi, D.; Forest, J. A review of recent range image registration methods with accuracy evaluation. Image Vis. Comput. 2007, 25, 578–596. [Google Scholar] [CrossRef]
  15. Zhou, G.Q. Geo-referencing of video flow from small low-cost civilian UAV. IEEE Trans. Autom. Eng. Sci. 2010, 7, 156–166. [Google Scholar] [CrossRef]
  16. Santamaría, J.; Cordón, O.; Damas, S. A comparative study of state-of-the-art evolutionary image registration methods for 3D modeling. Comput. Vis. Image Underst. 2011, 115, 1340–1354. [Google Scholar] [CrossRef]
  17. Markelj, P.; Tomaževič, D.; Likar, B.; Pernuš, F. A review of 3D/2D registration methods for image-guided interventions. Med. Image Anal. 2012, 16, 642–661. [Google Scholar] [CrossRef] [PubMed]
  18. Bican, J.; Flusser, J. 3D rigid registration by cylindrical phase correlation method. Pattern Recognit. Lett. 2009, 30, 914–921. [Google Scholar] [CrossRef]
  19. Cordón, O.; Damas, S.; Santamaría, J. Feature-based image registration by means of the CHC evolutionary algorithm. Image Vis. Comput. 2006, 22, 525–533. [Google Scholar] [CrossRef]
  20. Chung, J.; Deligianni, F.; Hu, X.; Yang, G. Extraction of visual features with eye tracking for saliency driven 2D/3D registration. Image Vis. Comput. 2005, 23, 999–1008. [Google Scholar] [CrossRef]
  21. Fitzgibbon, W. Robust registration of 2D and 3D point sets. Image Vis. Comput. 2003, 21, 1145–1153. [Google Scholar] [CrossRef]
  22. Ding, L.; Goshtasby, A.; Satter, M. Volume image registration by template matching. Image Vis. Comput. 2001, 19, 821–832. [Google Scholar] [CrossRef]
  23. Hsu, L.Y.; Loew, M.H. Fully automatic 3D feature-based registration of multi-modality medical images. Image Vis. Comput. 2001, 19, 75–85. [Google Scholar] [CrossRef]
  24. Benameur, S.; Mignotte, M.; Parent, S.; Labelle, H.; Skalli, W.; de Guise, J. 3D/2D registration and segmentation of sclerotic vertebrae using statistical models. Comput. Med. Imaging Graph. 2003, 27, 321–337. [Google Scholar] [CrossRef]
  25. Gendrin, C.; Furtado, H.; Weber, C.; Bloch, C.; Figl, M.; Pawiro, S.A.; Bergmann, H.; Stock, M.; Fichtinger, G.; Georg, D. Monitoring tumor motion by real time 2D/3D registration during radiotherapy. Radiother. Oncol. 2012, 102, 274–280. [Google Scholar] [CrossRef] [PubMed]
  26. Zikic, D.; Glocker, B.; Kutter, O.; Groher, M.; Komodakis, N.; Kamen, A.; Paragios, N.; Navab, N. Linear intensity-based image registration by Markov random fields and discrete optimization. Med. Image Anal. 2010, 14, 550–562. [Google Scholar] [CrossRef] [PubMed]
  27. Huseby, B.; Halck, O.; Solberg, R. A model-based approach for geometrical correction of optical satellite images. Int. Geosci. Remote Sens. Symp. 1999, 1, 330–332. [Google Scholar]
  28. Shin, D.; Pollard, J.K.; Muller, J.P. Accurate geometric correction of ATSR images. IEEE Trans. Geosci. Remote Sens. 1997, 35, 997–1006. [Google Scholar] [CrossRef]
  29. Thepaut, O.; Kpalma, K.; Ronsin, J. Automatic registration of ERS and SPOT multisensor images in a data fusion context. For. Ecol. Manag. 2000, 128, 93–100. [Google Scholar] [CrossRef]
  30. Horaud, R. New methods for matching 3-D objects with single perspective view. IEEE Trans. Pattern Anal. Mach. Intell. 1987, 9, 401–412. [Google Scholar] [CrossRef] [PubMed]
  31. Suetens, P.; Fua, P.; Hanson, A.J. Computational strategies for object recognition. ACM Comput. Surv. 1992, 24, 5–61. [Google Scholar]
  32. Lafarge, F.; Descombes, X.; Zerubia, J.; Deseilligny, M. Structural approach for building reconstruction from a single DSM. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 135–147. [Google Scholar] [PubMed]
  33. Habib, A.; Morgan, M.; Lee, Y. Bundle adjustment with self-calibration using straight lines. Photogramm. Rec. 2002, 17, 635–650. [Google Scholar]
  34. Zhang, Z.; Deng, F.; Zhang, J.; Zhang, Y. Automatic registration between images and laser range data. In Proceedings of SPIE MIPPR 2005: SAR and Multispectral Image Processing, Wuhan, China, 31 October–2 November 2005; Zhang, L., Zhang, J., Eds.; SPIE: Wuhan, China, 2005. [Google Scholar]
  35. Wolf, P.; Dewitt, B. Elements of Photogrammetry, 2nd ed.; McGraw-Hill Companies: New York, NY, USA, 2000. [Google Scholar]
  36. Triggs, B.; McLauchlan, P.; Hartley, R.; Fitzgibbon, A. Bundle adjustment—A modern synthesis. In Vision Algorithms: Theory and Practice; Triggs, B., Zisserman, A., Szeliski, R., Eds.; Springer-Verlag LNCS 1883: Heidelberg, Germany, 2000. [Google Scholar]
  37. Lawson, C.L.; Hanson, R.J. Solving Least Squares Problems; Society for Industrial and Applied Mathematics: New Providence, NJ, USA, 1987. [Google Scholar]
  38. Foley, J. Constructive Solid Geometry. In Computer Graphics: Principles and Practice; Addison-Wesley Professional: Upper Saddle River, NJ, USA, 1996; pp. 557–558. [Google Scholar]
Figure 1. Error of the 2D-3D registration caused by both various and dissymmetric image distortions, displacement and a rigid 3D object model.
Figure 2. The geometry for the coplanar condition.
Figure 3. The geometry of the perpendicular condition.
Figure 4. Discussion of the relationship between the straight line and the coplanar constraint and the relationship between the perpendicular constraint and the angle-unpreserved perspective projection.
Figure 5. Six aerial images from two strips in the study area, City of Denver, Colorado [6].
Figure 6. (a) 2D brightness is applied to represent the digital building model (DBM); and (b) constructive solid geometry (CSG) is used to represent the DBM [6].
Figure 7. (a) The corners of buildings in the 3D model, which are taken as “GCPs” and (b) the corresponding 2D image coordinates in the 2D image plane.
Figure 8. The two types of geometric constraints: (a) the linear features in the building model; and (b) the corresponding linear features in the aerial image.
Figure 9. The experimental results of 2D-3D registration using the proposed method. (a) The 2D-3D registration results with the proposed method; (b) the sub-area of (a) framed by the black rectangle; (c) the sub-area of (b), which illustrates the detailed 2D-3D registration results of a single high-rise building.
Figure 10. Visual comparison of 2D-3D registration using the two models: (a1–e1) are the 2D-3D registration results created by the proposed model, in which 32 points plus the two types of constraints are employed; (a2–e2) and (a3–e3) are the 2D-3D registration results created by conventional models, in which 211 “GCPs” and 12 “GCPs” are employed, respectively.
Table 1. Accuracy evaluation of the 2D-3D registration parameters: image distortion parameters (focal length = 153.022 mm; $x_0$ = 0.002 mm; $y_0$ = −0.004 mm).

| Registration Model | Control Primitives | $\kappa_1$ ($\times 10^{-4}$) | $\rho_1$ ($\times 10^{-6}$) | $\rho_2$ ($\times 10^{-6}$) | $\delta_0^{IOP}$ (pixel) |
|---|---|---|---|---|---|
| Traditional model | 12 points | 0.208 | −0.114 | 0.198 | 1.23 |
| Traditional model | 211 points | 0.301 | −0.174 | 0.222 | 1.09 |
| Our model | 32 points + constraints | 0.707 | −0.201 | 0.038 | 0.47 |
Table 2. Accuracy evaluation of the 2D-3D registration parameters: exterior orientation parameters (EOPs).

| Registration Model | Registration Primitives | $\phi$ (rad) | $\omega$ (rad) | $\kappa$ (rad) | $X_S$ (ft) | $Y_S$ (ft) | $Z_S$ (ft) | $\delta_0^{EOP}$ (pixel) |
|---|---|---|---|---|---|---|---|---|
| Traditional model | 12 points | 0.0104 | 0.0158 | −1.5503 | 3143007.3 | 1696340.4 | 9032.0 | 1.26 |
| Traditional model | 211 points | −0.0025 | −0.0405 | −1.5546 | 3143041.4 | 1696562.6 | 9070.7 | 1.02 |
| Our model | 32 points + constraints | −0.0017 | −0.0322 | −1.5544 | 3143041.2 | 1696533.4 | 9071.8 | 0.37 |
Table 3. Accuracy comparison of 2D-3D registration in 2D space.

| Registration Model | Control Primitives | $RMS_X$ (pixel) | $RMS_Y$ (pixel) |
|---|---|---|---|
| Conventional model | 12 points | 9.4 | 10.1 |
| Conventional model | 211 points | 5.2 | 6.7 |
| Proposed model | 32 points + constraints | 1.8 | 2.2 |

