1. Introduction
A dense 3D reconstruction can be performed using active sensors or image data coming from passive sensors. Laser scanners or structured light systems are commonly used to generate dense point clouds, as these systems can collect large volumes of points in a short time even over textureless surfaces. Other techniques to obtain dense point clouds use images that are collected by passive sensors. Advances in sensor technology and the development of new dense matching algorithms have allowed the generation of dense clouds of points [1,2,3] with passive optical images, which provide high precision, reliability and automation. Significant improvements in the hardware and algorithms, e.g., Structure from Motion (SfM), have made photogrammetry a competitive technology due to its high accuracy, automation and affordable costs. Objects can be successfully imaged and reconstructed using image-based approaches, and 3D modelling can achieve results that are comparable to, or even better than, those achieved from laser scanning.
Image matching is a fundamental task in the dense object reconstruction procedure, and it requires the establishment of correspondences between primitives that are extracted from images. The problem of image matching can be solved by using stereopsis [4,5] or multi-image correspondences [1,6,7,8]. In recent decades, the strategies for dense point correspondences have been focused on multi-view approaches [9,10], graphics processing units (GPUs) [11], per-pixel measurements [12], global energy minimisation algorithms [5,13], and dynamic programming approaches [14]. A review on high density image matching was presented by Remondino et al. [15], in which available open-source and commercial software packages were also assessed for dense point cloud generation.
Several matching techniques, such as feature- and area-based algorithms, have been proposed in the photogrammetric literature. Among them, least squares matching (LSM) is a well-known and highly accurate area-based matching technique that is used to set up a geometric correspondence between two or more image patches. The basic concept was introduced by Förstner [16] and later refined by Ackermann [17] and adapted for image matching by Gruen [18]. Ackermann [17] and Gruen [19] evaluated the use of geometric and radiometric parameters in rectangular and square patches in which an affine transformation was used as a linear model. Furthermore, adaptations of LSM have been tested and assessed for practical usability, e.g., using multi-image adjustment with additional geometric constraints [20] and approaches for object-oriented matching [21]. Other studies have investigated the adaptability and performance of LSM, such as that of Bethmann and Luhmann [22], who used a projective transformation in the functional model to improve the adaptability.
Regarding the application of LSM for matching refinement, Zhang et al. [23] presented a scheme for matching keypoints in images acquired by unmanned aerial vehicles (UAVs). An LSM based on pyramids was used as a refinement step to improve the final precision. Debella-Gilo and Kääb [24] explored LSM applications related to surface displacement and deformation of mass movements. The authors showed that LSM could accurately match the images and determine strain rates.
Matching algorithms have been developed along with studies on their accuracy, as reported in the first experiments produced by Schewe and Förstner [25] in industrial applications and later in research using DSM generation [26,27,28,29,30]. Xu et al. [31] conducted studies on the measurement accuracy and efficiency using compensation methods based on digital image correlation (DIC). Other studies have also been conducted to evaluate the accuracy achieved with image-based methods, for example, for deformation measurements [32,33], shape measurement [34] and deflection in cylindrical structures [35]. Remondino et al. [15] commented on the difficulties in defining an evaluation element (entire surface, small patches or points) and which procedure to use for accuracy assessment. For example, Seitz et al. [36] assessed completeness and accuracy (i.e., how much of the scene was reconstructed and how close the result was to the ground truth) based on the Euclidean distance. Bethmann and Luhmann [22] indicated some factors that could affect the performance of LSM, such as the texture, template size, geometric distortion between images, quality of the initial values and transformation model for matching. In contrast, improvements achieved with technological advances have sped up photogrammetric tasks and produced results that have better quality and accuracy at an affordable cost, which makes photogrammetry an attractive field for research and widespread application. Sutton et al. [37] presented a discussion considering recent progress in DIC.
This paper introduces a novel technique to produce a continuous 3D reconstruction of a cylindrical surface based on stereo-pairs. When an image sequence over a cylindrical object is collected by moving a camera in a single direction, image patches of corresponding areas are distorted according to a plane projection of a conic section. These distortions make it unfeasible to perform image matching for large image patches. The problem to be solved is the correction of geometric distortions to allow the application of a continuous matching, thus making it feasible to reconstruct the cylindrical surface. The hypothesis was that it would be possible to model the parallax changes by a geometric transformation function to correct the distortion in such a way that homologous areas could be matched and refined with a modified LSM to achieve sub-pixel accuracy. Consequently, all pixels of a continuous patch of the cylindrical surface could have their 3D coordinates estimated.
Moreover, the current techniques typically require a large set of images to perform a 3D reconstruction. In contrast, the proposed technique requires only a single stereo-pair (two images) to generate a 3D point cloud of a cylindrical surface with high accuracy. The main objective of this study is to present and assess the reconstruction technique using a modified function for the LSM in cylindrical surfaces. The experiments were based on a cylindrical object in the laboratory for conceptual validation purposes, but the approach can be extended to applications with similar surfaces such as construction pipes, lighting poles, mechanical parts, and other objects. The next sections present the methodological steps, experiments and results assessment, showing that an accuracy of 1/10 mm can be achieved in similar datasets.
2. Methodology
The proposed methodology aims at performing a continuous reconstruction of 3D cylindrical patches with high accuracy. The technique accomplishes a fitting between the image patches extracted from the images collected at different viewpoints using a modified geometric transformation for the LSM.
2.1. The Concept
Figure 1 depicts the image acquisition procedure to illustrate the concept. The image acquisition is performed by displacing the camera along a straight path parallel to the cylindrical axis. In this example, three views of a strip over the cylinder are presented. Due to the cylindrical shape and the viewpoint changes, three geometric shapes that correspond to the strip over the cylinder are produced in the images. As seen in Figure 1, the strip appears as a horizontal shape (the main axis appears as a straight line) when the image is collected from a normal view (the projecting ray is normal to the strip over the cylinder). If the projecting ray of the strip path is oblique, then the strip has a curved shape.
The camera displacement path parallel to the cylindrical axis ensures that the images have approximately the same scale, and differences in the image patches of the object are caused by the depth and viewpoint changes. In this case, to perform a 3D reconstruction with an image matching procedure using the frontal image as a reference, the pixels of a second image acquired with a suitable base length must be geometrically transformed so that the homologous images present similar features, which makes it possible to fit a model, as depicted in Figure 2.
Given two image patches or regions that refer to the same area over the cylindrical diameter but with different perspective views, parallaxes will occur due to depth and orientation variations. A suitable geometric transformation combined with an image matching procedure can relate the corresponding pixels of the two patches to produce a continuous 3D reconstruction of all pixels within the region.
It is important to note that the presented example was based on a vertical camera displacement parallel to the cylindrical axis. However, if the object is lying in a horizontal plane, then the camera movement should be horizontal and the same effect will be produced. The first step is to determine which function can transform a curved patch (a conic section) to a reference patch, with both representing the same area in the object. The reference patch does not need to be exactly perpendicular to the camera because a 3D reconstruction will be made later. Next, a second step is performed to refine the image coordinates to achieve sub-pixel precision, as will be presented in the next two sections.
2.2. Mathematical Model
The concept of the proposed technique for cylindrical surface reconstruction uses a geometric transformation T with a further refinement by an adaptive least squares matching (ALSM) to accurately map a point from an image I1(x, y) to its respective correspondence in an image I2(x, y) with sub-pixel precision.
Because the camera displacement is nearly parallel to the cylindrical axis, the mapping I1(x, y) → I2(x, y) in the x-direction (horizontal) can be made by an affine transformation, which is sufficient to absorb small differences because the remaining variations are negligible. In the y-direction (vertical), the cylindrical surface produces a perspective effect that is similar to a conic section, which requires a non-linear transformation; this effect can be modelled as a parabola. Thus, the mapping in y is made by an affine transformation with the addition of a second-order term, as shown in Equation (1):
where a1 and b1 are translations; a2, a3, b2 and b3 denote parameters that fit the shape (rotation, scales in x and y, and shear); b4 represents the concavity of the curve; and I′2(x′, y′) are the coordinates transformed from a point I1(x, y). If the displacement is performed in the x-direction, a similar model could be used, but with the additional parameter introduced in the first line of Equation (1).
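Based on the parameter definitions above (seven parameters, an affine mapping in x, and an affine mapping plus one second-order term in y), Equation (1) is taken here to have the following form. This is an assumed reconstruction: in particular, attaching the second-order term b4 to x² (so that a strip running along x is bent into a parabola in y) is inferred from the description and is not a confirmed detail.

```latex
\[
T:\quad
\begin{aligned}
x' &= a_1 + a_2\,x + a_3\,y \\
y' &= b_1 + b_2\,x + b_3\,y + b_4\,x^{2}
\end{aligned}
\tag{1}
\]
```

Under this form, the x′ line has three unknowns and the y′ line has four, which agrees with the total of seven parameters.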
The parameters of the geometric function T in Equation (1) can be estimated by least squares adjustment [38], provided that a minimum of four suitable corresponding points have been established between the two regions (each correspondence contributes two equations). The pixel coordinates of the image patch of the cylindrical surface in I1 are used as observations in the system of equations, and the seven parameters (ai and bi) are the unknowns to be estimated. After estimating the parameters, the transformation T is applied to the cylindrical region of I1 to generate coordinates that must be re-sampled over a new grid of discrete coordinates.
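The estimation described above can be illustrated with a minimal sketch in Python using NumPy. It assumes that the model has the form x′ = a1 + a2·x + a3·y and y′ = b1 + b2·x + b3·y + b4·x² (an assumption about Equation (1), as is the choice of x² for the second-order term). The function names `estimate_transform` and `apply_transform` are hypothetical and serve only for illustration.

```python
import numpy as np


def estimate_transform(src, dst):
    """Least squares estimate of the seven parameters (a1..a3, b1..b4).

    Assumed model (not confirmed by the source):
        x' = a1 + a2*x + a3*y
        y' = b1 + b2*x + b3*y + b4*x**2
    src, dst: (n, 2) arrays of corresponding (x, y) points, n >= 4.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if src.shape[0] < 4:
        raise ValueError("at least four corresponding points are required")
    x, y = src[:, 0], src[:, 1]
    ones = np.ones_like(x)
    # The model is linear in the unknowns, so each line is solved directly.
    A_x = np.column_stack([ones, x, y])           # design matrix for x'
    A_y = np.column_stack([ones, x, y, x ** 2])   # design matrix for y'
    a, *_ = np.linalg.lstsq(A_x, dst[:, 0], rcond=None)
    b, *_ = np.linalg.lstsq(A_y, dst[:, 1], rcond=None)
    return a, b


def apply_transform(a, b, pts):
    """Map (x, y) points with the assumed seven-parameter model."""
    pts = np.asarray(pts, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    xp = a[0] + a[1] * x + a[2] * y
    yp = b[0] + b[1] * x + b[2] * y + b[3] * x ** 2
    return np.column_stack([xp, yp])
```

Because the y′ line has four unknowns in this form, four points are the minimum for a solution, and the points must be spread in x so that the second-order column is not degenerate.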
This preliminary transformation provides a good approximation between the corresponding regions, minimising the geometric distortions caused by the orientation and depth (curves are reshaped to near-linear forms), which enables the application of the final image matching procedure with ALSM refinement in a following step.
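The re-sampling over a new grid can be sketched as follows. The example assumes the same form of the transformation as above, maps each pixel of the output grid to a position in the input image and interpolates bilinearly. The mapping direction, the bilinear interpolation and the function names are illustrative assumptions, since the source does not specify the interpolation method.

```python
import numpy as np


def bilinear_sample(image, xs, ys):
    """Sample a 2D image at float positions; positions are clipped to the image."""
    h, w = image.shape
    xs = np.clip(xs, 0, w - 1)
    ys = np.clip(ys, 0, h - 1)
    # Keep the upper-left neighbour at least one pixel away from the border.
    x0 = np.minimum(np.floor(xs).astype(int), w - 2)
    y0 = np.minimum(np.floor(ys).astype(int), h - 2)
    dx, dy = xs - x0, ys - y0
    top = (1 - dx) * image[y0, x0] + dx * image[y0, x0 + 1]
    bottom = (1 - dx) * image[y0 + 1, x0] + dx * image[y0 + 1, x0 + 1]
    return (1 - dy) * top + dy * bottom


def resample_patch(image, a, b, out_shape):
    """Re-sample `image` on a regular grid through the assumed transformation.

    For each output pixel (x, y), the grey value is read at
        x' = a1 + a2*x + a3*y
        y' = b1 + b2*x + b3*y + b4*x**2
    """
    rows, cols = out_shape
    yy, xx = np.mgrid[0:rows, 0:cols].astype(float)
    xs = a[0] + a[1] * xx + a[2] * yy
    ys = b[0] + b[1] * xx + b[2] * yy + b[3] * xx ** 2
    return bilinear_sample(np.asarray(image, dtype=float), xs, ys)
```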
2.3. Refinement Using ALSM
Let f(x, y) and g(x, y) denote conjugated image patches of a region over a cylindrical surface after a preliminary geometric transformation. The problem of image matching based on ALSM can be solved by estimating the transformation parameters to match the region g(x, y) with f(x, y). However, due to random effects between the regions, an error vector e(x, y) is added to establish the image matching model, as seen in Equation (2):
The position values of g(x, y) must be determined to provide a match point. An approximation to within a few pixels is required, which typically is accomplished by a correlation coefficient and, in this case, by an initial transformation. Next, the refinement using the transformation T in Equation (1) with ALSM is performed based on the technique proposed by Gruen [19]. Because the adjustment of one image region to another is a non-linear problem, the model is linearized with respect to the seven parameters by a Taylor series expansion. The differentiation yields Equation (3):
In addition to these seven parameters, two radiometric parameters, r1 and r2 (the brightness and contrast, respectively), are included to form the system of linearized equations in Equation (4) in which g′(x, y) is the re-sampled image region from g(x, y):
For simplification, the partial derivatives of g(x, y) in Equation (4), which correspond to image gradients, can be represented by gx and gy, as shown in Equation (5):
The digital number values and coordinates of all corresponding pixels from the regions are used together in Equation (4). Thus, a linearized system with nine parameters in Equation (6) is obtained:
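As a hedged illustration of how such a nine-parameter system can be assembled and solved for one iteration, the following Python sketch uses NumPy. It assumes the common ALSM formulation in which each pixel contributes one observation f − g′ and the design row [gx, gx·x, gx·y, gy, gy·x, gy·y, gy·x², 1, g′], corresponding to the corrections (da1, da2, da3, db1, db2, db3, db4, r1, r2). The parameter ordering, the use of central differences for the gradients and the function name `alsm_update` are assumptions.

```python
import numpy as np


def alsm_update(f, g_resampled, xx, yy):
    """One linearised least squares update for the nine assumed parameters.

    f           : template patch (2D array)
    g_resampled : search patch re-sampled with the current parameters (g')
    xx, yy      : pixel coordinates of the patch, same shape as f
    Returns the corrections (da1, da2, da3, db1, db2, db3, db4, r1, r2)
    and the residuals of the linearised system.
    """
    f = np.asarray(f, dtype=float)
    g0 = np.asarray(g_resampled, dtype=float)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    gy, gx = np.gradient(g0)
    columns = [
        gx, gx * xx, gx * yy,                 # da1, da2, da3
        gy, gy * xx, gy * yy, gy * xx ** 2,   # db1, db2, db3, db4
        np.ones_like(g0), g0,                 # r1 (brightness), r2 (contrast)
    ]
    A = np.column_stack([c.ravel() for c in columns])
    obs = (f - g0).ravel()
    dp, *_ = np.linalg.lstsq(A, obs, rcond=None)
    residuals = obs - A @ dp
    return dp, residuals
```

In an iterative scheme, the corrections would be added to the current parameter values, the patch re-sampled again, and the update repeated until the corrections become negligible.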
This system can be solved iteratively by the least squares method using the adjustment proposed by Gruen [19], with the addition, in this case, of the parameter db4. After completing the ALSM procedure, all pixels of the conjugated region have their image coordinates matched between the two images with sub-pixel precision. Then, 3D surface reconstruction can be achieved by a photogrammetric intersection procedure with collinearity equations (using intrinsic and extrinsic parameters of cameras). The determination of the 3D coordinates is performed continuously, pixel by pixel, for the cylindrical surface in a local reference system.
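The intersection of two rays can be illustrated with a short Python sketch. Note that this example uses a linear (homogeneous, DLT-style) intersection with 3 × 4 projection matrices, which is a simplification substituted here for the collinearity-based adjustment mentioned above. The function name and the projection-matrix parameterisation are assumptions.

```python
import numpy as np


def intersect_two_rays(P1, P2, xy1, xy2):
    """Linear two-ray intersection (a stand-in for the collinearity solution).

    P1, P2   : 3x4 projection matrices (intrinsic and extrinsic parameters)
    xy1, xy2 : matched image coordinates (x, y) in image 1 and image 2
    Returns the 3D point in the object reference system.
    """
    P1 = np.asarray(P1, dtype=float)
    P2 = np.asarray(P2, dtype=float)
    # Each image coordinate gives one homogeneous linear equation in X.
    A = np.vstack([
        xy1[0] * P1[2] - P1[0],
        xy1[1] * P1[2] - P1[1],
        xy2[0] * P2[2] - P2[0],
        xy2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

Applying this function to every matched pixel pair of the conjugated region would yield the continuous point cloud. A rigorous implementation would also propagate the precision of the image coordinates and of the orientation parameters.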
In summary, the following steps are used in this procedure:
(i) Initial approximation between the conjugated regions with an extraction of distinguishable corresponding points (or keypoints) from the cylindrical surface. These points are used to estimate the seven parameters (ai and bi) of the geometric transformation via least squares adjustment. Some kind of texture is required to extract points that describe the surface curvature, which could be produced either by a speckle painting [33] or by structured light projection.
(ii) From the matched points, a window is opened around the conjugated region, which covers the cylindrical diameter.
(iii) The image coordinates of the image patch that has the curve effect are re-sampled by the transformation function, which performs the first approximation of the image coordinates with a discrepancy of a few pixels.
(iv) The correlation coefficient is computed between the image patches to determine a match point at the pixel level, although some distortions are still noticeable.
(v) An image matching refinement by iterative ALSM is performed to fine-tune the shape and estimate the sub-pixel coordinates of all points in the region.
(vi) The continuous 3D reconstruction for all points inside the region is made with a photogrammetric intersection.
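The pixel-level matching by correlation coefficient in the procedure above can be sketched as follows in Python. The exhaustive search over a small search window and the function names are assumptions for illustration; the source does not describe how the search is organised.

```python
import numpy as np


def correlation_coefficient(f, g):
    """Normalised cross-correlation coefficient between two equal-size patches."""
    f = np.asarray(f, dtype=float).ravel()
    g = np.asarray(g, dtype=float).ravel()
    fz = f - f.mean()
    gz = g - g.mean()
    denom = np.sqrt((fz ** 2).sum() * (gz ** 2).sum())
    if denom == 0:
        return 0.0  # a homogeneous patch carries no matching information
    return float((fz * gz).sum() / denom)


def best_match(template, search):
    """Slide the template over the search area and keep the highest coefficient.

    Returns (coefficient, row, col) of the upper-left corner of the best window.
    """
    th, tw = template.shape
    sh, sw = search.shape
    best = (-2.0, 0, 0)
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            rho = correlation_coefficient(template, search[r:r + th, c:c + tw])
            if rho > best[0]:
                best = (rho, r, c)
    return best
```

The coefficient is invariant to linear changes in brightness and contrast, which is why it is suitable for the approximate match that precedes the ALSM refinement.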
The presented technique was implemented and tested. In the following sections, the technical applications and results will be presented, as well as the validation and accuracy assessments achieved with the 3D reconstructions.
4. Analysis of the Results
The analysis of the results was based on the comparison between the diameter values estimated by both photogrammetry and direct measurement with a calliper (σ = 0.05 mm). The accuracy assessment provided by the proposed technique was determined by the difference between these two measurements. To calculate the cylinder diameter from the 3D reconstruction, a circle was adjusted to the XY coordinates. The estimated points were used as observations in the circle fitting by the least squares method.
A circle in 2D can be described by three independent parameters in which (Xc, Yc) are the coordinates of the circle centre and r represents its radius. A point (Xi, Yi) on the circle satisfies Equation (7) [38]:
The circle fitting was solved by the iterative least squares method, adopting the centroid of the planimetric coordinates and its mean radius as initial approximations to Xc, Yc and r. The diameter assessment with circle fitting was first performed using the strip with black and white squares over the cylinder. A comparison was also performed with the results produced by three 3D point determination techniques (see Figure 9). In all cases, the corners were automatically extracted from the two images and used as follows:
Photogrammetric intersection: using only the 3D corner coordinates determined by the intersection of rays with previously estimated extrinsic parameters.
Bundle block adjustment (BBA): using only the 3D corner coordinates determined by bundle adjustment. In this case, the image coordinates of the corners were defined as tie points in the set of 52 images used in the camera calibration procedure.
ALSM + intersection: using all 3D points (only planimetric coordinates) of the central line of the reconstructed strip, which were generated by a photogrammetric intersection with the proposed technique.
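The circle fitting used in all three cases can be illustrated with a minimal Gauss-Newton sketch in Python. It assumes the standard circle condition (Xi − Xc)² + (Yi − Yc)² = r², minimises the radial residuals, and starts from the centroid and the mean distance to it, as described above. The function name, the stopping criterion and the absence of observation weights are assumptions.

```python
import numpy as np


def fit_circle(X, Y, max_iter=50, tol=1e-12):
    """Iterative least squares circle fit on planimetric coordinates.

    Minimises the radial residuals sqrt((Xi-Xc)^2 + (Yi-Yc)^2) - r.
    Returns (Xc, Yc, r); the diameter is 2 * r.
    Assumes no point coincides with the current centre estimate.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Initial approximations: centroid of the points and mean distance to it.
    xc, yc = X.mean(), Y.mean()
    r = np.hypot(X - xc, Y - yc).mean()
    for _ in range(max_iter):
        dx, dy = X - xc, Y - yc
        d = np.hypot(dx, dy)
        residuals = d - r
        # Jacobian of the residuals with respect to (Xc, Yc, r).
        J = np.column_stack([-dx / d, -dy / d, -np.ones_like(d)])
        delta, *_ = np.linalg.lstsq(J, -residuals, rcond=None)
        xc, yc, r = xc + delta[0], yc + delta[1], r + delta[2]
        if np.abs(delta).max() < tol:
            break
    return xc, yc, r
```

For points that cover only a short arc, the centroid lies far from the true centre, so the starting values are rough and convergence should be checked.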
Table 4 presents the standard deviation (σ) of the three parameters that result from the circle fitting. In the three techniques, the values were less than 1 mm, notably when the combination of ALSM + intersection was used, which resulted in the smallest standard deviation values. The dense point sequence produced the smallest dispersions. The accuracy was also calculated by the difference between the estimated and measured diameters. The photogrammetric intersection and BBA techniques indicated small differences of 0.20 mm and 0.27 mm, respectively, whereas ALSM + intersection obtained the most accurate result (0.13 mm), achieving approximately 1/10 mm for a camera-to-the-object distance of 1.10 m and a base/depth ratio of 0.21.
The graph in Figure 10 displays the residuals that result in the XY coordinates after the circle fitting using 335 points of an arc segment generated by the ALSM + intersection techniques. The largest residuals were produced in the cylinder borders (>1.5 mm), where the deformation is expected to be larger. In the central part, the residuals were less than 0.9 mm. Furthermore, the residuals in both coordinates had an alternating behaviour around zero.
The reconstruction accuracy was also assessed for the textured areas presented in Section 3.5, where two image patches (above and below the strip) were considered in the experiments. For this case, two assessments of the measurement errors were made: the first using only the keypoints automatically extracted with the SIFT technique and the second using the continuous reconstruction technique. In the first assessment, the 3D coordinates of the keypoints were estimated only with a photogrammetric intersection, and then a circle was fitted. In the second, the ALSM + intersection was used, and the XY coordinates of the 3D points from each image patch were used to fit a circle, as shown in Figure 11a,b.
Table 5 presents the standard deviations and errors that result from the circle fitting in the two textured areas extracted from the cylinder, which were called Part I and Part II. The errors were computed by comparing the estimated diameters with the value that was measured directly with a calliper. When only the SIFT keypoints were used, the diameter error was 0.003 mm in Part I and 0.0079 mm in Part II. The largest standard deviations in the circle centre determination indicated a dispersion of approximately 0.37 mm in X and 2 mm in Y, with 1.74 mm for the diameter. However, more accurate values were obtained with the ALSM technique, which generated errors of 0.0001 mm and 0.0033 mm for Parts I and II, respectively. In addition, the standard deviations were also smaller and in both Parts I and II were below 0.20 mm. The results with the textured areas were more accurate than those achieved with the checkered strip because, in the checkered patches, the gradients are concentrated in a few directions, and the internal areas are homogeneous and do not contribute to improving the solution with the modified LSM. It should be noted that the errors that were assessed by comparing the indirect estimates from the images with the direct measure were smaller than the nominal precision of the calliper, which is 0.05 mm. Thus, the error values presented in Table 5 can be considered to be optimistic, but it can be concluded that the proposed technique can deliver accuracies that are better than 0.05 mm.
Additional experiments were performed with two metallic cylinders to assess the repeatability of the technique and its accuracy. Both cylinders were first painted with a matte coating to avoid reflections and then randomly sprayed with black ink to generate a speckle pattern. They were placed on the calibration panel for image acquisition, as shown in Figure 12. Next, the 3D reconstruction technique was applied as presented with the textured areas. Two calibrated electronic micrometers (Coolant Proof IP65-MX, Mitutoyo, Chicago, IL, USA, both with a resolution of 0.001 mm and an accuracy of 0.002 mm) were used to measure the diameters at two perpendicular positions.
Table 6 presents the direct measures of the cylinders used as reference and the results after circle fitting. As can be seen, the errors were less than 0.07 mm in the two cylinders and the standard deviations of the estimated circle parameters were less than 0.21 mm. In all cases, the standard deviations in X were smaller than in Y. This effect was caused by the camera displacement, which was always performed along the Y-axis. Consequently, the largest deformations also occurred in the Y-direction.
For an overall assessment, the experiments with the strip and textured areas showed that the image patches were continuously reconstructed in the object space, and the accuracy achieved a sub-millimetre level, which was also confirmed with the metallic cylinders. With regard to the technical feasibility of the approach, since the extrinsic parameters of the images are known, homologous regions of the cylindrical surface can be easily located using the points extracted from the cylinder. The image coordinates of these points can be used in collinearity equations to determine the conjugated regions from which the image patches were extracted. Typically, a 3D reconstruction using photogrammetric procedures requires a large set of images. However, in this approach, only two images were needed to produce a continuous 3D reconstruction. Even regions with homogeneous textures were reconstructed, as was accomplished with the checkered strip. Some guidelines based on the produced experiments can be recommended to guarantee better technical performance:
For determining the geometric transformation function, a narrow strip that covers the cylindrical circumference is sufficient.
Points extracted from the surface for image matching should be distributed along the cylindrical width for modelling its curvature.
Points extracted for image matching should not be collinear, so that the geometric transformation parameters can be better estimated.
In addition to these recommendations, it was observed in the experiments that there is no need for a high point density to determine the geometric transformation parameters. This finding was verified by the results produced with the checkered strip (with 20 corners) and the texture area below the strip (with 11 keypoints), which produced an accuracy close to 1/10 mm. The preliminary geometric transformation, which is a fundamental step, performs only the first approximation between the corresponding image patches at the pixel level. The ALSM adjustment is the key step that determines the match points with high accuracy.
5. Conclusions
This paper introduced a technique for continuous 3D reconstruction of cylindrical surfaces. The concept was based on image acquisition with a base path parallel to the axis of the cylindrical object. The hypothesis was that changes in the camera viewpoint in a single direction would generate different cylindrical perspective effects similar to a conic section and that this effect could be modelled by a single additional parameter in a geometric transformation. Then, the different effects could be corrected by a geometric transformation followed by ALSM to produce a 3D object reconstruction using a continuous matching with only two images.
The experiments were conducted with a textured cylinder that included a checkered strip for validation purposes. Synthetic checkered corners were automatically extracted and used in the experiments to define the parameters of a geometric transformation function and also to assess the results. Keypoints that were automatically extracted from textured areas were also used to verify the technical feasibility when well-defined corners were not available.
Since the object of study was a cylinder, the accuracy assessment of the presented technique was based on the discrepancy in determining the cylindrical diameter using circle fitting by the least squares algorithm. The diameter estimated with the proposed technique was compared to the value directly measured in the cylinder for accuracy assessment. The error obtained with the checkered strip was 0.13 mm, and when the textured areas were used, the error was less than 0.012 mm. This difference between the results is due to the background of the areas. In the checkered strip, only the gradient points over the edges contribute to the solution, and several of them are correlated. On the other hand, in the textured areas, there is much variability in all directions, which provides a better refinement in the ALSM. In general, the results demonstrated that the technique achieved an accuracy of approximately 1/10 mm (for the scale and B/D ratio used in this study) considering the cylindrical circumference. Additionally, the objective of performing a continuous 3D reconstruction was achieved.
The proposed technique was tailored for a cylindrical surface, which is a common shape that is difficult to model by conventional photogrammetric techniques. Several benefits can be envisioned. The technique runs entirely in automatic mode and achieves sub-millimetre accuracy, if accurate extrinsic parameters are provided. The image acquisition technique is simple to handle due to the use of a single camera. The 3D reconstruction is made with a minimum of two images, and textured areas can be reconstructed. The technique can be used in industrial applications in which accurate measurements are required; in forests where trunks are commonly measured and reconstructed; or even for poles in outdoor images. It is important to emphasize that the main focus of this paper was to present the modelling technique with the methodological steps and the resulting accuracies in the cases studied. The accuracy that the technique can achieve in general applications has not been addressed and should be assessed in future works.