The relationship between the three-dimensional information of any point in space and the corresponding point in the image can be obtained by calibrating the binocular camera. The camera calibration accuracy directly determines the reliability of the binocular ranging results. However, the underwater binocular camera often requires a waterproof layer, which causes light to refract many times during propagation, so the pinhole imaging model used on land is not suitable for underwater imaging. In this study, the underwater imaging model is modified to conform to the pinhole imaging model, and then a laser rangefinder is applied to further correct the camera parameters.
3.1.1. Camera Imaging Model
The pinhole camera model is commonly adopted to represent the imaging principle of cameras. The imaging models of monocular and binocular cameras are the same. As shown in Figure 4a, the size of the image changes as the pinhole plane moves forward and backward. The pinhole camera model is shown as follows:

$$x = f\frac{X}{Z} \tag{1}$$

where $f$ represents the focal length of the camera, $Z$ represents the distance from the hole to the object, $X$ represents the length of the object, and $x$ represents the length of the object on the image plane.
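For convenience, the transformation process from the world coordinate system to the pixel coordinate system is represented in Figure 4b. Transformation between the world coordinate system and the camera coordinate system can be achieved through coordinate system rotation and translation, as shown in Equation (2).

$$\begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix} = R \begin{bmatrix} X_w \\ Y_w \\ Z_w \end{bmatrix} + T \tag{2}$$

where $R$ represents the 3 × 3 rotation matrix, $T$ represents the translation vector, and $(X_w, Y_w, Z_w)$ and $(X_c, Y_c, Z_c)$ represent the world coordinates and camera coordinates of point $P$, respectively. The relationship between the camera coordinates and image coordinates is shown in Equation (3), which can be expressed in matrix form, as shown in Equation (4).

$$x = f\frac{X_c}{Z_c}, \quad y = f\frac{Y_c}{Z_c} \tag{3}$$

$$Z_c \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix} \tag{4}$$

where $(x, y)$ represent the image coordinates of point $P$, and $f$ represents the focal length of the camera. The relationship between the image coordinates and pixel coordinates is shown in Equation (5), which can be expressed in matrix form, as shown in Equation (6).

$$u = \frac{x}{d_x} + u_0, \quad v = \frac{y}{d_y} + v_0 \tag{5}$$

$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} 1/d_x & 0 & u_0 \\ 0 & 1/d_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \tag{6}$$

where $(u, v)$ represent the pixel coordinates of point $P$, $(u_0, v_0)$ represent the pixel coordinates of the center of the image, and $d_x$ and $d_y$ are the physical size of a single pixel. Finally, the transformation relationship between the world coordinate system and the pixel coordinate system can be derived as follows:

$$Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & u_0 & 0 \\ 0 & f_y & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} R & T \\ \mathbf{0}^{\mathsf{T}} & 1 \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} = M_1 M_2 \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} \tag{7}$$

where $f_x = f/d_x$, $f_y = f/d_y$, and $M_1$ and $M_2$ represent the internal and external parameter matrices of the camera, respectively.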
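The chain from world coordinates to pixel coordinates can be illustrated with a short Python sketch: a rotation and translation into the camera frame, a perspective division, and a scaling and shift into pixels. The function name `project_point` and all numeric values (focal lengths in pixels, image center, identity pose) are illustrative assumptions rather than calibration results from this study, and lens distortion is not modeled.

```python
import numpy as np

def project_point(P_w, R, T, fx, fy, u0, v0):
    """Project a 3D world point to pixel coordinates with the pinhole model."""
    P_w = np.asarray(P_w, dtype=float)
    # World -> camera: rotation followed by translation.
    X_c, Y_c, Z_c = R @ P_w + T
    if Z_c <= 0:
        raise ValueError("point is behind the camera")
    # Camera -> pixel: perspective division, then scale by the focal
    # lengths (in pixels) and shift by the image center.
    u = fx * X_c / Z_c + u0
    v = fy * Y_c / Z_c + v0
    return np.array([u, v])

# Hypothetical intrinsics (pixels) and an identity pose, for illustration only.
R = np.eye(3)
T = np.zeros(3)
fx, fy, u0, v0 = 800.0, 800.0, 320.0, 240.0

# A point on the optical axis lands on the image center (u0, v0).
on_axis = project_point([0.0, 0.0, 1.0], R, T, fx, fy, u0, v0)
# A point off the axis is displaced in proportion to X/Z and Y/Z.
off_axis = project_point([0.1, -0.05, 1.0], R, T, fx, fy, u0, v0)
print(on_axis, off_axis)
```

Because the pixel offset from the image center scales with the focal length, any error in the focal length propagates directly into the predicted pixel position.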
3.1.2. Modified Model for Underwater Imaging
Zhang’s calibration method [29] is a popular, highly accurate camera calibration method that is often used to calibrate cameras on land, and it is used to calibrate the binocular camera in this study. About 18 pairs of checkerboard images are considered an appropriate number for calibration [30]. To improve the calibration results, 64 pairs of images are collected on land and underwater. The calibration results for the binocular camera on land and underwater are shown in Table 1.
In Table 1, apart from $k_1$ and $k_2$, which are radial distortion parameters used to eliminate image distortion, only $f_x$ and $f_y$ change greatly between the calibration results on land and those underwater. $d_x$ and $d_y$ are the physical size of a single pixel, which does not change, so the parameter that changes the most is the focal length. To compare the effects of the land and underwater calibration results on binocular ranging, both sets of calibration results are used for underwater ranging. Binocular ranging is performed at distances from 0.1 m to 1.3 m. Each distance is measured five times, and the variance of the repeated measurements is approximately 0. The ranging results are shown in Figure 5.
In Figure 5, when the calibration results on land are applied to underwater binocular ranging, the ranging error is very large and grows linearly with distance. The reason is that the binocular camera is not affected by light refraction when calibrated on land, whereas underwater light refraction cannot be ignored. When the underwater calibration results are applied to underwater binocular ranging, the ranging error is clearly lower, but it still grows linearly as the distance increases. When the distance is 1.3 m, the ranging error reaches 0.1 m.
Ideally, the slopes of these three lines should match the slope of the black line, but in practice, only the slope of the laser rangefinder line consistently approaches the ideal slope. A comparison between the calibration results on land and underwater reveals that the most important parameter is the focal length. Moreover, in the literature [31], the largest difference between the camera calibration results after considering the multilayer refraction of underwater light and the calibration results obtained by using Zhang’s calibration method is also the focal length. Therefore, the focal length is the key factor for improving binocular underwater ranging.
To correct the focal length of the camera, the underwater imaging model of the camera is analyzed. The main difference between using a binocular camera on land and underwater is that the camera must be placed in a waterproof cover when used underwater. Light enters the waterproof cover from the water and then enters the camera from the waterproof cover, undergoing two refractions. Therefore, the underwater imaging model no longer conforms to the pinhole camera model. To ensure that the underwater imaging model remains consistent with the pinhole camera model, the underwater imaging model is modified as shown in Figure 6. The waterproof cover is only 2 mm thick, so its own refraction can be ignored: a thin cover does not change the light’s final direction and only induces a negligible radial shift [32].
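The claim that a thin cover leaves the final ray direction unchanged can be checked with Snell's law. The sketch below traces one ray from water through a flat cover into air and compares it with a ray that crosses a single water-to-air interface. It assumes a flat cover made of acrylic (refractive index about 1.49), which the text does not specify, and a hypothetical incidence angle of 20°; the function names are illustrative.

```python
import math

N_WATER = 1.333  # refractive index of water
N_COVER = 1.49   # assumed acrylic cover; the material is not stated in the text
N_AIR = 1.0

def refract(theta_in, n_in, n_out):
    """Snell's law: angle (radians) of the ray after passing from n_in to n_out."""
    s = n_in * math.sin(theta_in) / n_out
    if abs(s) > 1.0:
        raise ValueError("total internal reflection")
    return math.asin(s)

def through_cover(theta_water, thickness_mm):
    """Trace a ray water -> cover -> air through a flat cover.

    Returns the exit angle in air and the lateral offset (mm) relative to a
    ray that keeps its in-water direction across the cover thickness.
    """
    theta_cover = refract(theta_water, N_WATER, N_COVER)
    theta_air = refract(theta_cover, N_COVER, N_AIR)
    offset = thickness_mm * (math.tan(theta_water) - math.tan(theta_cover))
    return theta_air, offset

theta_w = math.radians(20.0)  # hypothetical incidence angle in water
exit_with_cover, offset_mm = through_cover(theta_w, 2.0)
exit_without_cover = refract(theta_w, N_WATER, N_AIR)

# The two exit angles coincide; only a small sideways offset remains.
print(math.degrees(exit_with_cover), math.degrees(exit_without_cover))
print(offset_mm)
```

Under these assumptions the exit direction depends only on the water and air indices, and the offset for a 2 mm cover stays below a tenth of a millimeter, which is consistent with treating the cover's contribution as negligible.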
In Figure 6, the light starts from the three-dimensional point $P$ and is affected by water refraction. The pixel coordinates of point $P$ on the real image plane are $(u, v)$. If refraction is not considered, the pixel coordinates of point $P$ on virtual image plane $I$ are $(u', v')$. The pixel coordinates obtained without considering refraction are clearly inconsistent with the actual pixel coordinates. Therefore, the dotted line in Figure 6 is extended to find coordinates that agree with the actual coordinates $(u, v)$. The focal length changes accordingly, and the new value is called the virtual focal length $f'$. Then, the sources of binocular ranging errors are further analyzed. A schematic diagram of the binocular ranging principle is shown in Figure 7.
In Figure 7, the optical centers of the left and right cameras are $O_l$ and $O_r$. The distance between $O_l$ and $O_r$ is the baseline, denoted as $b$. $x_l$ and $x_r$ are the $x$ values of point $P$ in the left and right image coordinate systems, respectively.
According to the triangle similarity principle, we can obtain

$$\frac{b - (x_l - x_r)}{b} = \frac{Z - f}{Z} \tag{8}$$

Then, we further obtain

$$Z = \frac{fb}{x_l - x_r} \tag{9}$$

$$Z = \frac{fb}{d} \tag{10}$$

where $Z$ is the depth of point $P$ and $d = x_l - x_r$ represents the difference between the $x$ values of point $P$ in the image coordinate systems of the left and right cameras, called the disparity.
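As a minimal sketch of the triangulation step, the standard relation that depth equals focal length times baseline divided by disparity can be written as a small Python function. The function name, the baseline of 0.06 m, and the pixel values are hypothetical; the focal length and the disparity only need to share the same unit (pixels here).

```python
def depth_from_disparity(f_px, baseline_m, x_left_px, x_right_px):
    """Depth of a point from its horizontal positions in a rectified stereo pair."""
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    # Depth is proportional to the focal length and the baseline, and
    # inversely proportional to the disparity.
    return f_px * baseline_m / disparity

# Hypothetical values: 800 px focal length, 0.06 m baseline, 48 px disparity.
z = depth_from_disparity(800.0, 0.06, 350.0, 302.0)
print(z)  # depth in metres, about 1.0 for these values
```

The inverse dependence on disparity means that a fixed disparity error produces a larger depth error at long range than at short range.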
Equation (10) shows that the depth $Z$ of point $P$ in world coordinates is directly related to disparity and focal length. The camera calibration is calculated according to the pinhole camera model and does not take water refraction into account. In Figure 6, point $P$ corresponds to the refraction-free pixel coordinates $(u', v')$ according to this calibration; however, it actually corresponds to the real pixel coordinates $(u, v)$. The incorrect pixel coordinates cause incorrect image coordinates, and the disparity calculated from these image coordinates directly affects the calculation of $Z$. Therefore, the binocular camera still has a ranging error after underwater calibration. However, when the effect of refraction is considered and the underwater imaging model is approximated as a pinhole model, as in Figure 6, a virtual focal length $f'$ is constructed so that point $P$ correctly corresponds to $(u, v)$, which can eliminate the error.
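The effect of a focal length mismatch on ranging can be sketched numerically. The simplified model below assumes that the whole error can be summarized as a single wrong focal length: the observed disparity is generated with an effective focal length `f_true`, while depth is recovered with the calibrated value `f_used`. Both values, the baseline, and the function name are hypothetical and are not taken from Table 1.

```python
def estimated_depth(true_depth_m, f_true, f_used, baseline_m=0.06):
    """Depth recovered when the calibration uses f_used instead of f_true."""
    # Disparity actually observed, governed by the effective focal length.
    disparity = f_true * baseline_m / true_depth_m
    # Depth computed by a calibration that assumes the focal length is f_used.
    return f_used * baseline_m / disparity

f_true = 1064.0  # hypothetical effective (virtual) focal length underwater, px
f_used = 800.0   # hypothetical calibrated focal length, px

distances = [0.1, 0.5, 0.9, 1.3]
errors = [abs(estimated_depth(z, f_true, f_used) - z) for z in distances]
for z, e in zip(distances, errors):
    print(f"true {z:.1f} m -> error {e:.3f} m")

# With the corrected focal length the error vanishes in this model.
corrected = estimated_depth(1.3, f_true, f_true)
```

In this model the recovered depth is the true depth scaled by `f_used / f_true`, so the error grows in proportion to distance, which matches the linear growth described for Figure 5, and it disappears once the focal length is corrected.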
Therefore, the calculation of the unknown $f'$ is key. The modified underwater imaging model is shown as

$$x = f'\frac{X_w}{Z_w + \Delta}, \quad y = f'\frac{Y_w}{Z_w + \Delta} \tag{11}$$

According to Equation (5), $f'_x$ and $f'_y$ can be further expressed as

$$f'_x = \frac{(u - u_0)(Z_w + \Delta)}{X_w} \approx \frac{(u - u_0) Z_w}{X_w}, \quad f'_y = \frac{(v - v_0)(Z_w + \Delta)}{Y_w} \approx \frac{(v - v_0) Z_w}{Y_w} \tag{12}$$

where $(X_w, Y_w, Z_w)$ represent the world coordinates of point $P$, $(x, y)$ represent the image coordinates corresponding to the pixel coordinates $(u, v)$, $\Delta$ represents the distance difference between the actual lens and the virtual lens, which is very small and can be ignored, and $f'_x$ and $f'_y$ represent the virtual focal lengths in the $x$ and $y$ directions, respectively. The world coordinates $(X_w, Y_w, Z_w)$ are difficult to obtain using a binocular camera. However, they can be obtained more easily by using a laser rangefinder and combining the position relationship between the laser rangefinder and the binocular camera. The detailed derivation of the position relationship is described in Section 3.3.1. Finally, the virtual focal length can be obtained to correct the camera calibration result.
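One possible way to estimate the virtual focal lengths from reference points is sketched below. It assumes that the pinhole relation holds with the lens offset neglected, so that the pixel offset from the image center equals the virtual focal length times $X/Z$ (or $Y/Z$), and it fits one focal length per axis by least squares over several points. The least-squares fit, the function name, and the synthetic points are assumptions for illustration; in the study the coordinates would come from the laser rangefinder rather than from synthetic data.

```python
import numpy as np

def virtual_focal_lengths(points_xyz, pixels_uv, u0, v0):
    """Least-squares virtual focal lengths (pixels) from 3D points and their pixels."""
    P = np.asarray(points_xyz, dtype=float)
    uv = np.asarray(pixels_uv, dtype=float)
    # Normalized coordinates X/Z and Y/Z of each reference point.
    a_x = P[:, 0] / P[:, 2]
    a_y = P[:, 1] / P[:, 2]
    # Pixel offsets from the image center.
    b_x = uv[:, 0] - u0
    b_y = uv[:, 1] - v0
    # Closed-form least squares for b = f * a on each axis.
    fx = float(a_x @ b_x) / float(a_x @ a_x)
    fy = float(a_y @ b_y) / float(a_y @ a_y)
    return fx, fy

# Synthetic check: generate pixels with known focal lengths, then recover them.
u0, v0 = 320.0, 240.0
fx_true, fy_true = 1064.0, 1060.0  # hypothetical values
pts = np.array([[0.10, 0.05, 0.5], [-0.08, 0.12, 0.9], [0.15, -0.10, 1.3]])
pix = np.column_stack([
    u0 + fx_true * pts[:, 0] / pts[:, 2],
    v0 + fy_true * pts[:, 1] / pts[:, 2],
])
fx_est, fy_est = virtual_focal_lengths(pts, pix, u0, v0)
print(fx_est, fy_est)
```

Points close to the optical axis contribute little to the fit because their $X/Z$ and $Y/Z$ values are near zero, so reference points spread across the field of view give a better-conditioned estimate.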