2.1. Configuration and Construction of the VMOS
The configuration of the proposed galvanometer–camera combined VMOS is shown in Figure 1a. The VMOS consisted of a galvanometer scanner, a camera with an appropriate lens, and a set of control units. The galvanometer scanner was fixed in front of the camera, and the control unit drove the camera and the galvanometer scanner simultaneously so that pictures were taken when the galvanometer deflected to a specified position. Light from the scene was deflected twice by the two mirrors in the galvanometer scanner and then captured by the camera sensor through the lens. By changing the turning angles of the two mirrors, the camera boresight and FOV could be adjusted, as shown in Figure 1b.
According to the principle of mirror transformation, changing the camera's field of view through the mirror deflections is equivalent to changing the camera's pose (including the position and direction), as shown in Figure 2. In Figure 2, $O_c\text{-}x_c y_c z_c$ is the camera coordinate system, which represents the pose of the real camera. The rotation angles of Mirror-1 and Mirror-2 are denoted as $\theta_1$ and $\theta_2$, respectively, which are uniquely determined by a pair of control values $(v_1, v_2)$. Suppose Mirror-1 and Mirror-2 are at the initial turning angles; then $O_c^1$ is the virtual camera position, which is specularly transformed from $O_c$ with Mirror-1, and $O_c^2$ is the virtual camera position, which is specularly transformed from $O_c^1$ with Mirror-2 in the initial status. The boresight of the real camera is identically transformed and marked as the blue dotted lines. When the turning angles $\theta_1$ and $\theta_2$ are changed to an arbitrary status, the corresponding virtual camera positions induced by the two mirror transformations are denoted as $O_c^{1\prime}$ and $O_c^{2\prime}$, respectively, and the virtual camera boresight in this status is marked as the red dotted lines.
To sum up, the virtual camera pose was related to the deflection angles $\theta_1$ and $\theta_2$, the distance between the rotation axes of the two mirrors, and the relative installation pose between the real camera and the galvanometer scanner. However, it is not trivial to directly calculate the pose matrices of the virtual cameras in practice for the following reasons: (1) The turning angles $\theta_1$ and $\theta_2$ are determined by a pair of control parameters $v_1$ and $v_2$, respectively. The non-linear mapping $\theta_k = f_k(v_k)$ between the turning angles and the control parameters needs to be carefully calibrated, and the calibration errors of $f_k$ may reduce the accuracy of the calculated virtual camera poses. (2) The distance between the rotation axes of the two mirrors is determined by the manufacturing process of the galvanometer scanner and is difficult to measure accurately in practice. (3) The relative installation pose between the camera and the galvanometer scanner is difficult to determine.
Instead of trying to calculate the virtual camera poses through specular reflection transformation, we enabled the galvanometer–camera to work as a virtual multi-ocular system, which needed to know neither the nonlinear mapping $f_k$, nor the rotation axis distance of the two mirrors, nor the installation pose of the camera. This scheme took advantage of the high repeatability of the galvanometer scanner. Specifically, the high repeatability of the scanner meant that whenever a specific control parameter $(v_1, v_2)$ was transmitted to the scanner, the corresponding deflection angles $\theta_1$ and $\theta_2$ remained almost unchanged every time, and hence, the imaging area of the system was always the same. In other words, given the control parameter $(v_1, v_2)$, the pose of the virtual camera was definitely determined. Therefore, we sampled the 2D control parameter domain in advance and endeavored to calibrate the virtual poses that corresponded to the sampled parameters. A one-to-one mapping from the sampled control parameters to the corresponding virtual cameras was established. All the virtual cameras constituted the virtual multi-ocular system.
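To make the sampling scheme concrete, the following is a minimal Python sketch; the control value ranges and sampling numbers are hypothetical, and `virtual_camera_of` simply records the one-to-one mapping described above.

```python
import itertools
import numpy as np

# Hypothetical control value ranges of the two galvanometer axes.
V1_RANGE = (-10.0, 10.0)   # control parameter v1 (drives Mirror-1)
V2_RANGE = (-10.0, 10.0)   # control parameter v2 (drives Mirror-2)
M, N = 5, 5                # sampling numbers m and n

def sample_control_grid(v1_range, v2_range, m, n):
    """Evenly sample the 2D control parameter domain into m*n pairs."""
    v1s = np.linspace(v1_range[0], v1_range[1], m)
    v2s = np.linspace(v2_range[0], v2_range[1], n)
    return list(itertools.product(v1s, v2s))

# One-to-one mapping: sampled control pair -> virtual camera index s.
# The scanner's high repeatability means each pair always reproduces
# the same mirror angles, so the pose calibrated for index s stays valid.
control_samples = sample_control_grid(V1_RANGE, V2_RANGE, M, N)
virtual_camera_of = {pair: s for s, pair in enumerate(control_samples)}
```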
In order to perform the camera imaging within the deflection range of the galvanometer scanner, the camera and the galvanometer scanner should be properly configured to guarantee that the view pyramid of any virtual camera resulting from the deflection of Mirror-1 intersects with Mirror-2, as shown in Figure 3.
More specifically, the parameters of the galvanometer–camera combination should meet the condition in Equation (1), where $\theta_1$ is the turning angle of Mirror-1, $\alpha$ is the FOV angle of the camera, $W$ is the width of Mirror-2, $O_c$ is the optical center point of the camera, $O_c^1$ is the optical center point of the virtual camera formed by Mirror-1, $P_1$ is the center point of Mirror-1, and $P_2$ is the center point of Mirror-2.
To guarantee that each virtual camera in the VMOS shared common FOVs with some of the others, the sampling numbers of the control parameters $(v_1, v_2)$ should satisfy the condition in Equation (2), where $m$ and $n$ are the least sampling numbers of the control parameters $v_1$ and $v_2$, respectively; $\theta_1^{\max}$ and $\theta_2^{\max}$ are the maximum turning angles of Mirror-1 and Mirror-2, respectively; and $\alpha_h$ and $\alpha_v$ are the camera FOV angles in the horizontal and vertical directions, respectively. Having determined $m$ and $n$, the 2D control parameter domain is evenly sampled. Then we have a number of $m \times n$ virtual cameras corresponding to the sampled control parameters $(v_1^s, v_2^s)$. The virtual cameras are denoted as $VC_s$, $s = 1, 2, \ldots, m \times n$.
The above control parameter sampling rule can ensure that adjacent virtual cameras share common FOVs. Most viewable regions of the VMOS have a fourfold overlap, as shown in Figure 4. The larger the sampling numbers $m$ and $n$ are, the more folds the viewing regions overlap, and the more constraints can be supplied for 3D reconstruction.
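Equation (2) itself is not reproduced above; a common form for such an overlap criterion bounds the angular step between adjacent boresights by the camera FOV, noting that a mirror rotation of $\theta$ deflects the beam by $2\theta$. The sketch below assumes exactly that form; the formula and the numeric values are illustrative assumptions, not the paper's condition.

```python
import math

def least_sampling_numbers(theta1_max, theta2_max, fov_h, fov_v):
    """Assumed overlap rule: a mirror turning through theta deflects the
    boresight by 2*theta, so the boresight sweeps 4*theta_max in total;
    adjacent views overlap if the step between samples stays below the FOV."""
    m = math.ceil(4.0 * theta1_max / fov_h) + 1
    n = math.ceil(4.0 * theta2_max / fov_v) + 1
    return m, n

# Illustrative values: +/-10 deg mirror range, 15 x 12 deg camera FOV.
m, n = least_sampling_numbers(math.radians(10), math.radians(10),
                              math.radians(15), math.radians(12))
print(m, n)  # least sampling numbers for v1 and v2
```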
2.2. Calibration Method of the VMOS
According to Section 2.1, the VMOS was composed of $m \times n$ virtual cameras $VC_s$ corresponding to the sampled control parameters $(v_1^s, v_2^s)$. Since all the virtual cameras were induced from the same real camera, the intrinsic parameters, including the pinhole imaging matrix and the distortion parameters, were the same for each virtual camera, while the poses $T_s$ of all the virtual cameras needed to be calculated.
Due to the large FOV of the VMOS, the calibration was difficult to realize with a calibration target all at once. We proposed a global optimization method for zonal calibration, combining Zhang's camera calibration method [28], the PnP (perspective-n-point) method [29], and the bundle adjustment (BA) method [30]. The main steps are summarized in Figure 5.
To realize the calibration method, we built a planar calibration target on which coded points were evenly distributed. The identification of each coded point in the images could be easily recognized by decoding. Denote the calibration target coordinate system as C-CS, and the coordinates of the coded points in C-CS as $P_C^i$, where the superscript $i$ represents the identification of a coded point. The specific steps of the proposed calibration method are as follows:
For image collection and calibration data preparation, put the calibration target at position $t$ in the working volume of the VMOS. Capture the image $I_{s,t}$ of the target at position $t$ with the virtual camera $VC_s$. Then extract the image coordinates $p_{s,t}^i$ of the coded points in image $I_{s,t}$. The 3D coordinates $P_t^i$ under a global coordinate system (G-CS) of the coded points on the calibration target at position $t$ are measured utilizing a photogrammetric device.
For the calibration of the camera intrinsic parameters, among the images $I_{s,t}$ with different index $t$ and fixed index $s$, match the 3D points $P_C^i$ with the image points $p_{s,t}^i$. Take the matched pairs $(P_C^i, p_{s,t}^i)$ into Zhang's monocular camera calibration process [28,31] for calibrating the intrinsic matrix $K$ in the pinhole camera model as shown in Equation (3) and the distortion parameters $D = (k_1, k_2, k_3, p_1, p_2)$ expressed in Equation (4).
$$ z\,\tilde{p} = K \left[ R \mid t \right] \tilde{P} \tag{3} $$
where $\tilde{P} = [X\; Y\; Z\; 1]^T$ is the homogeneous coordinates of the spatial point, $\tilde{p} = [u\; v\; 1]^T$ is the ideal homogeneous pixel coordinates of the corresponding image point, $[R \mid t]$ is the pose parameters of the camera, and $z$ is the depth coefficient.
$$ \begin{aligned} u_d &= u + (u - u_0)\left(k_1 r^2 + k_2 r^4 + k_3 r^6\right) + 2 p_1 (u - u_0)(v - v_0) + p_2 \left(r^2 + 2 (u - u_0)^2\right) \\ v_d &= v + (v - v_0)\left(k_1 r^2 + k_2 r^4 + k_3 r^6\right) + p_1 \left(r^2 + 2 (v - v_0)^2\right) + 2 p_2 (u - u_0)(v - v_0) \end{aligned} \tag{4} $$
where $u_d$ and $v_d$ are the observed pixel coordinates with distortion corresponding to the ideal coordinates $u$ and $v$, respectively; $r$ is the distance between the pixel point $(u, v)$ and the principal point $(u_0, v_0)$ of the pixel plane; $k_1$, $k_2$, and $k_3$ are the radial distortion parameters; and $p_1$ and $p_2$ are the tangential distortion parameters.
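As a sketch of this step, OpenCV's implementation of Zhang's method recovers $K$ and $D$ from the matched pairs; the function below assumes the coded points have already been matched per target position, with the planar target coordinates used as the object points.

```python
import numpy as np
import cv2

def calibrate_intrinsics(object_points, image_points, image_size):
    """Zhang's calibration for one fixed virtual camera index s.

    object_points: list over target positions t of (N, 3) float32 arrays,
                   the coded-point coordinates in the planar target frame
                   C-CS (z = 0), matched by identification i.
    image_points:  list over t of (N, 2) float32 arrays, the extracted
                   pixel coordinates p_{s,t}^i of the same coded points.
    image_size:    (width, height) of the camera sensor in pixels.
    """
    rms, K, D, rvecs, tvecs = cv2.calibrateCamera(
        object_points, image_points, image_size, None, None)
    # D contains (k1, k2, p1, p2, k3): the radial and tangential
    # distortion parameters of Equation (4), in OpenCV's ordering.
    return K, D, rms
```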
For the calibration of the virtual camera poses, to calculate the $s$th virtual camera pose, gather the coded points $p_{s,t}^i$ in the images $I_{s,t}$ as a group $G_s$ with the same index $s$ and different index $t$. Match the image points $p_{s,t}^i$ in each $G_s$ with the 3D points $P_t^i$ according to the indices $i$ and $t$. Utilizing the matched pairs $(P_t^i, p_{s,t}^i)$ in the specific group $G_s$, the pose of the virtual camera $VC_s$, i.e., the transformation matrix $T_s$ from G-CS to the virtual camera coordinate system $VC_s$-CS, is calculated through the PnP method [29].
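A sketch of the pose step for a single virtual camera; OpenCV's EPnP solver is used here as an assumed stand-in for the PnP method of [29].

```python
import numpy as np
import cv2

def estimate_virtual_camera_pose(points_gcs, pixels, K, D):
    """Solve the pose T_s of virtual camera VC_s from group G_s.

    points_gcs: (N, 3) coded-point coordinates P_t^i in G-CS, stacked
                over all target positions t observed by this camera.
    pixels:     (N, 2) matched image points p_{s,t}^i.
    Returns the 3x4 matrix T_s = [R_s | t_s] mapping G-CS to VC_s-CS.
    """
    ok, rvec, tvec = cv2.solvePnP(
        points_gcs.astype(np.float32), pixels.astype(np.float32),
        K, D, flags=cv2.SOLVEPNP_EPNP)
    if not ok:
        raise RuntimeError("PnP failed for this virtual camera")
    R, _ = cv2.Rodrigues(rvec)   # rotation vector -> 3x3 rotation matrix
    return np.hstack([R, tvec.reshape(3, 1)])
```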
For global optimization, to improve the calibration accuracy, the BA method [32] is applied to optimize the intrinsic parameters and all the virtual camera poses. In consideration of the lens distortion, we add radial distortion and tangential distortion to the BA model. The objective function of the nonlinear optimization is
$$ \min_{K,\, D,\, \{T_s\}} \sum_{s} \sum_{t} \sum_{i} \left\| p_{s,t}^i - \hat{p}_{s,t}^i \right\|^2 \tag{5} $$
where $\hat{p}_{s,t}^i$ is the reprojection pixel coordinates of the spatial point $P_t^i$ in virtual camera $VC_s$ calculated through Equations (3) and (4).
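A minimal sketch of this refinement with SciPy; the parameter packing and the observation layout are assumptions, and a production bundle adjustment would exploit the sparse Jacobian structure for speed.

```python
import numpy as np
import cv2
from scipy.optimize import least_squares

def ba_residuals(params, n_cams, observations, points_3d):
    """Stacked reprojection residuals of Equation (5).

    params packs fx, fy, cx, cy, five distortion terms, then a 6-vector
    (rvec, tvec) per virtual camera; observations is a list of tuples
    (cam_index, point_indices, observed_pixels).
    """
    K = np.array([[params[0], 0.0, params[2]],
                  [0.0, params[1], params[3]],
                  [0.0, 0.0, 1.0]])
    dist = params[4:9]                      # (k1, k2, p1, p2, k3)
    poses = params[9:].reshape(n_cams, 6)
    residuals = []
    for cam, idx, pix in observations:
        rvec, tvec = poses[cam, :3], poses[cam, 3:]
        proj, _ = cv2.projectPoints(points_3d[idx], rvec, tvec, K, dist)
        residuals.append((proj.reshape(-1, 2) - pix).ravel())
    return np.concatenate(residuals)

# x0 packs the Zhang/PnP estimates; least_squares then minimizes
# the sum of squared residuals, i.e., the objective of Equation (5):
# result = least_squares(ba_residuals, x0,
#                        args=(n_cams, observations, points_3d))
```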
Figure 6 shows the schematic diagram of the entire calibration process. Finally, the intrinsic matrix $K$, the distortion parameters $D$, and the extrinsic matrices $T_s$ of the virtual cameras were determined.
2.3. The 3D Reconstruction Method with the VMOS
Having completed the VMOS calibration, the intrinsic matrix $K$, the distortion parameters $D$, and the extrinsic pose matrices $T_s$ of all the virtual cameras $VC_s$ were obtained. The control parameter sampling rule described in Section 2.1 guarantees that the scene in the working volume of the VMOS can, in most regions, be observed by four or more virtual cameras. According to the triangulation method, the region observed by multiple virtual cameras can be 3D reconstructed, as shown in Figure 7.
In Figure 7, the image point $m_s$ corresponding to the spatial point $M$ can be expressed as
$$ z_s\, \tilde{m}_s = K T_s \tilde{M} \tag{6} $$
where $T_s$ is the $3 \times 4$ extrinsic pose matrix of $VC_s$; $\tilde{M}$ is the homogeneous coordinates of $M$ in the world coordinate system; $\tilde{m}_s = [u_s\; v_s\; 1]^T$ is the undistorted pixel coordinates of $M$ in the pixel coordinate system of virtual camera $VC_s$, which can be calculated from the observed image coordinates with Equation (4); and $z_s$ is the depth coefficient of point $M$ in the coordinate system of virtual camera $VC_s$. By eliminating $z_s$, Equation (6) can be reorganized as
$$ \begin{bmatrix} u_s h_s^3 - h_s^1 \\ v_s h_s^3 - h_s^2 \end{bmatrix} \tilde{M} = 0 \tag{7} $$
where $h_s^i$ represents the $i$th row of the matrix $H_s = K T_s$.
According to Equation (7), one camera can provide a $2 \times 4$ coefficient matrix. When there are $n$ cameras having observed the target point, an overdetermined linear system shown in Equation (8) can be obtained:
$$ A \tilde{M} = 0, \qquad A = \begin{bmatrix} u_1 h_1^3 - h_1^1 \\ v_1 h_1^3 - h_1^2 \\ \vdots \\ u_n h_n^3 - h_n^1 \\ v_n h_n^3 - h_n^2 \end{bmatrix} \tag{8} $$
Perform singular value decomposition (SVD) [30] on the coefficient matrix $A$. The 3D coordinates of $M$ can be obtained from the right singular vector corresponding to the minimum singular value.
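A sketch of the linear triangulation of Equations (7) and (8) in NumPy, where each projection matrix is $H_s = K T_s$ for a virtual camera that observed the point.

```python
import numpy as np

def triangulate(projections, pixels):
    """Linear multi-view triangulation via SVD (Equations (7)-(8)).

    projections: list of 3x4 matrices H_s = K @ T_s.
    pixels:      list of undistorted pixel coordinates (u_s, v_s).
    Returns the 3D point M in G-CS.
    """
    rows = []
    for H, (u, v) in zip(projections, pixels):
        rows.append(u * H[2] - H[0])   # u_s * h^3 - h^1
        rows.append(v * H[2] - H[1])   # v_s * h^3 - h^2
    A = np.asarray(rows)               # 2n x 4 coefficient matrix
    _, _, Vt = np.linalg.svd(A)
    M_h = Vt[-1]                       # right singular vector of the
    return M_h[:3] / M_h[3]            # minimum singular value, dehomogenized
```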
2.4. Pose Estimation Method Using the VMOS
Object pose estimation is one of the most common applications of machine vision. The PnP algorithm is the most common means of monocular pose estimation [33,34]. From the 2D image coordinates observed by one camera and the corresponding known 3D coordinates of the target, the transformation between the object coordinate system and the camera coordinate system can be calculated. Then the six-degree-of-freedom (DOF) pose parameters of the object with respect to the camera coordinate system can be obtained.
However, due to the limited field of view of each virtual camera, it may be impossible to obtain enough points for the PnP calculation in a single perspective. In addition, the pose calculated by PnP from a single view is expressed in the current virtual camera coordinate system and needs to be converted to the VMOS coordinate system using the pose parameters of each virtual camera, which is cumbersome and inconvenient. Fortunately, the proposed VMOS could observe the same object point with different virtual cameras, which had the potential to provide more constraints for determining the object pose than ordinary cameras. Taking advantage of the large FOV of the VMOS, we proposed a global pose estimation algorithm to directly obtain the object pose in the VMOS coordinate system by utilizing the images from multiple virtual cameras.
In our pose estimation scheme with the VMOS, not all the virtual cameras but only those having observed the feature points for the pose estimation participated in the calculation. As shown in Figure 8, suppose the calibrated virtual camera $VC_{s_j}$ observes a point $Q^j$ in the object coordinate system (O-CS) concerning the pose estimation, and the corresponding undistorted pixel coordinates $m_j$ are obtained. Then, $m_j$ can be transformed to the normalization plane in $VC_{s_j}$-CS according to (9):
$$ x_j = K^{-1} \tilde{m}_j \tag{9} $$
where $x_j$ is the coordinates of point $q_j$, which is on the normalization plane in $VC_{s_j}$-CS corresponding to $m_j$.
The correspondence between the spatial point $Q^j$ and the line $l_j$, which passes through $q_j$ and the optical center of $VC_{s_j}$, was established. Utilizing the extrinsic matrix $T_{s_j} = [R_{s_j} \mid t_{s_j}]$, the line $l_j$ was transformed from $VC_{s_j}$-CS to the VMOS coordinate system (i.e., G-CS), as shown in Equations (10) and (11):
$$ d_j = R_{s_j}^{T} d_j^{c} \tag{10} $$
$$ c_j = -R_{s_j}^{T} t_{s_j} \tag{11} $$
where $d_j$ is the normalized orientation vector of line $l_j$ in G-CS, $c_j$ is a passing point of line $l_j$ in G-CS, and $d_j^{c} = x_j / \| x_j \|$ is the normalized orientation vector of $l_j$ in $VC_{s_j}$-CS.
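A sketch of Equations (9)-(11): back-projecting an undistorted pixel to the normalization plane and expressing the resulting viewing line in G-CS.

```python
import numpy as np

def pixel_to_line(K, T_s, m_j):
    """Build the viewing line l_j in G-CS for undistorted pixel m_j.

    K:   3x3 intrinsic matrix.
    T_s: 3x4 extrinsic matrix [R_s | t_s] mapping G-CS to VC_s-CS.
    Returns (d_j, c_j): unit direction and a passing point of l_j in G-CS.
    """
    R, t = T_s[:, :3], T_s[:, 3]
    x_j = np.linalg.inv(K) @ np.array([m_j[0], m_j[1], 1.0])  # Equation (9)
    d_cam = x_j / np.linalg.norm(x_j)   # unit direction in VC_s-CS
    d_j = R.T @ d_cam                   # Equation (10): direction in G-CS
    c_j = -R.T @ t                      # Equation (11): optical center in G-CS
    return d_j, c_j
```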
Given $N$ pairs of 3D point–line correspondences $(Q^j, l_j)$, formed by the virtual cameras $VC_{s_j}$ observing points $Q^j$ on an object, the pose estimation can be modeled as the non-perspective PnP (NPnP) [35] problem: every transformed object point should lie on its corresponding viewing line, i.e.,
$$ \left( I - d_j d_j^{T} \right) \left( R\, Q^j + t - c_j \right) = 0, \qquad j = 1, \ldots, N \tag{12} $$
which is solved in the least-squares sense as
$$ \min_{R,\, t} \sum_{j=1}^{N} \left\| \left( I - d_j d_j^{T} \right) \left( R\, Q^j + t - c_j \right) \right\|^2 \tag{13} $$
where $(d_j, c_j)$ are the parameters of line $l_j$ and $T_O^G = [R \mid t]$ is the transformation matrix from O-CS to G-CS. We utilized the Procrustean solution provided in [35] to estimate the transformation matrix $T_O^G$ in Equation (13). After obtaining the result, a BA optimization was performed to improve the accuracy. Taking $T_O^G$ as the initial value, we minimized the reprojection errors expressed in Equation (14) to finally obtain the object pose parameters:
$$ \min_{R,\, t} \sum_{j=1}^{N} \left\| m_j - \hat{m}_j \right\|^2 \tag{14} $$
where $\hat{m}_j$ is the reprojection pixel coordinates of the spatial point $Q^j$ in virtual camera $VC_{s_j}$ and can be expressed as in Equation (15):
$$ z_j\, \hat{\tilde{m}}_j = K T_{s_j} T_O^G\, \tilde{Q}^j \tag{15} $$
The Gauss–Newton iteration method was used to minimize the objective function in Equation (14).
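A sketch of this refinement follows; SciPy's `least_squares` is used in place of a hand-written Gauss–Newton loop (its default trust-region method takes Gauss–Newton-type steps), with the NPnP result supplying the initial pose.

```python
import numpy as np
import cv2
from scipy.optimize import least_squares

def pose_residuals(params, points_ocs, pixels, cam_idx, K, T_list):
    """Reprojection residuals of Equation (14) for the object pose.

    params:     rvec (3) + tvec (3) parameterizing T_O^G.
    points_ocs: (N, 3) object points Q^j in O-CS.
    pixels:     (N, 2) undistorted observations m_j.
    cam_idx:    index s_j of the virtual camera observing each point.
    K, T_list:  shared intrinsic matrix and the calibrated 3x4 poses T_s.
    """
    R_o, _ = cv2.Rodrigues(params[:3])
    pts_g = points_ocs @ R_o.T + params[3:]          # O-CS -> G-CS
    res = []
    for p_g, m, s in zip(pts_g, pixels, cam_idx):
        p_c = T_list[s][:, :3] @ p_g + T_list[s][:, 3]  # G-CS -> VC_s-CS
        proj = K @ p_c
        res.append(proj[:2] / proj[2] - m)           # Equation (15) reprojection
    return np.concatenate(res)

# x0 = np.hstack([cv2.Rodrigues(R_npnp)[0].ravel(), t_npnp])
# result = least_squares(pose_residuals, x0,
#                        args=(points_ocs, pixels, cam_idx, K, T_list))
```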