The digital orthophoto map (DOM) has long been one of the main products of photogrammetry and remote sensing owing to its high accuracy, rich information, and intuitive expression, and it therefore plays an important role in many fields such as flood monitoring, coastline protection, disaster prevention and relief, and urban planning [1,2,3,4,5,6]. DOM generation in traditional photogrammetry is based on the perspective projection of ordinary cameras (such as frame cameras), but the field of view (FOV) of such cameras is generally limited to 40°–50°, so a single image covers only a limited area and multiple photographs are often required to cover a large area. Fisheye lens cameras have a wide FOV (close to or even more than 180°) and are widely used in visual navigation [7,8,9], target detection [8,9,10,11,12], environmental surveillance [13,14], and forest monitoring [15,16]. Using fisheye cameras instead of ordinary cameras in photogrammetry can expand single-image coverage and greatly improve work efficiency. However, the fisheye lens has a short focal length and a complex structure, and fisheye images suffer from severe nonlinear distortions, which prevents them from meeting the requirements of accurate measurement. Moreover, the fisheye camera follows the law of spherical projection, which differs from the imaging process of a perspective projection camera, so the differential orthorectification model of traditional photogrammetry cannot be directly applied to the orthorectification of fisheye images.
To address the above problems, this paper proposes an orthorectification method for fisheye images under an equidistant projection model. The remainder of this paper is organized as follows:
Section 1 briefly reviews the related work;
Section 2 introduces the proposed fisheye image orthorectification model and its solution method;
Section 3 presents the experimental results and their analysis;
Section 4 discusses the potential, critical aspects, and limitations of the research conducted.
Related Work
Fisheye images exhibit much larger distortion than traditional perspective projection images and must therefore be geometrically corrected before use. To this end, many correction methods for fisheye images have been proposed. At present, they fall mainly into two categories: calibration-based correction methods and correction methods based on projection transformation models.
Camera calibration is the process of determining the interior orientation parameters (IOPs) and exterior orientation parameters (EOPs) of the camera as well as the lens distortions [17,18]. According to the calibration object used, camera calibration methods can be divided into three-dimensional calibration methods, two-dimensional calibration methods, and self-calibration methods.
(1) Three-dimensional calibration methods. Methods based on a three-dimensional calibration object first select control points on the 3D calibration object and calculate the IOPs of the fisheye camera from the relationship between the 3D coordinates of the control points and their corresponding image points on the fisheye image. In 2005, Schwalbe [19] calibrated a fisheye camera by laying out a large number of control points with known 3D coordinates in a calibration room; the control points were distributed so that they formed concentric circles in the fisheye image plane, with equal distances between adjacent control points on each circle, which enabled a more rigorous solution. In 2014, Ahmad et al. [20] used an equidistant projection model to calibrate a fisheye camera based on the relationship between the 3D position of a spherical object with a known radius and its projected curve in the fisheye image. In the same year, Tommaselli et al. [21] constructed a 3D terrestrial field composed of 139 ARUCO-coded targets; the camera was calibrated in this terrestrial test field using a conventional bundle adjustment with collinearity and mathematical models specially designed for fisheye lenses. In 2016, Sahin [22] calibrated an "Olloclip 3 in one" fisheye lens based on an equidistant projection model, using 112 control points on a 150 cm diameter antenna as a 3D calibration field. In the same year, Urquhart et al. [23] proposed a camera model for fixed daytime sky imagers, together with an associated automatic calibration model, using a 180° fisheye camera to photograph the sky; the changing position of the sun in the image plane provides a simple and automated calibration method, although it has a high time cost.
(2) Two-dimensional calibration methods. All the marks of a planar calibration board lie on the same plane, which makes the board quick to manufacture and easy to maintain, so the two-dimensional calibration method is simpler than the three-dimensional one. In 2000, Zhang [24] proposed a two-step planar-template method to calibrate a camera: template pictures are taken from different angles and directions, the correspondence between 3D coordinate points and 2D planar points is established from the corner points on the template, and the intrinsic parameters of the camera are then solved. However, this calibration method is only suitable for lenses with small distortions and is not applicable to fisheye cameras with a large FOV. In 2006, Kannala and Brandt [25] proposed a generalized calibration model for both fisheye and conventional cameras, which can be calibrated using only fisheye images of a calibration plate, mainly by minimizing the projection error to find the most suitable parameter values. In 2012, Kanatani [26] proposed a calibration method for ultra-wide fisheye cameras based on eigenvalue minimization, using the three constraints of collinearity, parallelism, and orthogonality on a 2D plane plate. In 2013, Arfaoui and Thibault [27] calibrated a fisheye camera using a virtual grid, generating an accurate virtual calibration grid and calibrating the camera by rotating it around two axes. In 2016, Zhu et al. [28] proposed a method for estimating the EOPs of a fisheye camera based on 2D cone programming in convex optimization. In 2021, Ling et al. [29] proposed a structured-light-based calibration method for mobile omnidirectional cameras, based on the constraint relationship between the vanishing points in the fisheye image and the intrinsic parameters of the imaging model.
(3) Self-calibration methods. In these methods, the camera is calibrated using only the relationship between corresponding points in multiple fisheye images, without any calibration object. In 2007, Hartley and Kang [30] proposed a method to simultaneously estimate the radial distortion and the other intrinsic parameters, building on the calibration method of [24]. This method determines the radial distortion in a parameter-free manner and is independent of any particular radial distortion model, which makes it suitable for cameras ranging from small-FOV to wide-angle lenses. In 2009, Schneider et al. [31] self-calibrated four representative projection models (the stereographic, equidistant, orthographic, and equisolid-angle projection models) and found that distortion correction could be performed by adjusting the lens distortion parameters even when a non-matching projection model was used. However, the method is inaccurate in calculating the image point coordinates and the focal length. In 2010, Hughes et al. [32] compared the self-calibration accuracy of the stereographic, equidistant, orthographic, equisolid-angle, FET, PFET, FOV, and division projection models. In 2017, Perfetti et al. [33] used the PFET lens distortion model to self-calibrate a fisheye lens camera. In 2019, Choi et al. [34] proposed an accurate self-calibration method for fisheye lens cameras using V-type test objects; the RMSE of this method is less than 1 pixel, but it is not effective for analyzing camera lens distortion. In the same year, Kakani et al. [35] proposed a self-calibration method applicable to multiple large-FOV camera models.
Generally speaking, calibration-based correction methods require complex mathematical models of the fisheye lens projection process; they achieve high correction accuracy but require special calibration equipment and complex software algorithms. Moreover, the correction results only conform to the mathematical model and do not necessarily bring an intuitive, significant improvement for human visual observation.
Correction methods based on projection transformation models use a simplified projection model to approximate the complex optical imaging principle of the fisheye lens. Although their correction accuracy is not as good as that of calibration-based methods, the correction principle is simple and easy to implement, and the visual effect improves significantly. However, correction using the spherical projection model suffers from the loss of scene content around the edges of the fisheye image. The correction methods based on projection transformation models generally use the cylindrical model [36,37], the spherical perspective projection model [38], the latitude–longitude model, and the double-longitude model [39,40]. In recent years, with the rapid development of deep learning, more and more researchers have tried to use convolutional neural networks to correct fisheye images, owing to their strong visual feature expressiveness [41,42,43,44].
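As an illustration of this family of methods, the following sketch reprojects an equidistant fisheye image into a perspective view by inverse mapping: each output pixel is traced back through the perspective model to an incidence angle and then forward through the equidistant model to a fisheye pixel. This is a minimal, assumption-laden example (principal point at the image center, known focal lengths, nearest-neighbour resampling, no lens distortion), not the method of any cited paper:

```python
import numpy as np

def fisheye_to_perspective(img, f_fish, f_persp, out_size):
    """Resample an equidistant fisheye image into a perspective view.

    img      : HxW(xC) fisheye image; principal point assumed at the center
    f_fish   : fisheye focal length in pixels (equidistant: r = f * theta)
    f_persp  : focal length of the target perspective view, in pixels
    out_size : (height, width) of the output image
    """
    H, W = img.shape[:2]
    cy, cx = (H - 1) / 2.0, (W - 1) / 2.0
    oh, ow = out_size
    ocy, ocx = (oh - 1) / 2.0, (ow - 1) / 2.0

    # Pixel grid of the target perspective image, centered on its principal point
    v, u = np.mgrid[0:oh, 0:ow]
    x, y = u - ocx, v - ocy
    r_persp = np.hypot(x, y)

    # Incidence angle of each ray: r_persp = f_persp * tan(theta)
    theta = np.arctan2(r_persp, f_persp)

    # Equidistant model maps the same ray to r_fish = f_fish * theta
    r_fish = f_fish * theta
    scale = np.divide(r_fish, r_persp,
                      out=np.zeros_like(r_fish), where=r_persp > 0)

    # Nearest-neighbour lookup in the fisheye image
    su = np.clip(np.round(cx + x * scale).astype(int), 0, W - 1)
    sv = np.clip(np.round(cy + y * scale).astype(int), 0, H - 1)
    return img[sv, su]
```

Because the mapping runs from output to input, every perspective pixel receives a value, but content near the fisheye rim is compressed or cropped — the edge-loss limitation noted above.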
The above studies only calibrated and corrected fisheye camera images to suit human visual habits, without further correcting them to ortho-images that meet accurate measurement needs. To address this problem, this paper first constructs a fisheye image orthorectification model and a fisheye camera calibration model; second, it establishes high-precision 3D calibration fields to calculate the IOPs, EOPs, and lens distortion parameters; finally, it introduces a digital elevation model (DEM) and orthorectifies the original fisheye image using the proposed method. Experiments show that the proposed model can quickly and accurately correct a fisheye image into an ortho-image.
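The core geometric step of such a pipeline — projecting a ground cell, with its elevation taken from the DEM, into the fisheye image via the EOPs and the equidistant model — can be sketched as follows. This is a simplified illustration with hypothetical symbols (lens distortion omitted, a single ground point rather than the full resampling loop), not the complete model developed in this paper:

```python
import numpy as np

def project_ground_point(X, Y, Z, C, R, f, cx, cy):
    """Project one ground point onto an equidistant fisheye image.

    X, Y, Z : ground coordinates of the DOM cell (Z read from the DEM)
    C       : camera projection center (3-vector), from the EOPs
    R       : world-to-camera rotation matrix (3x3), from the EOPs
    f       : fisheye focal length in pixels (IOP)
    cx, cy  : principal point in pixels (IOPs)
    Returns the (column, row) fisheye pixel to sample; distortion is omitted.
    """
    # Ray from the projection center to the ground point, in the camera frame
    p = R @ (np.array([X, Y, Z], dtype=float) - np.asarray(C, dtype=float))

    # Incidence angle between the ray and the optical (z) axis
    rho = np.hypot(p[0], p[1])
    theta = np.arctan2(rho, p[2])

    # Equidistant projection: radial image distance r = f * theta
    r = f * theta
    if rho > 0:
        return cx + r * p[0] / rho, cy + r * p[1] / rho
    return cx, cy  # ray along the optical axis hits the principal point

# Orthorectification then loops over every DOM cell, reads Z from the DEM,
# projects the cell with this function, and resamples the fisheye image there.
```

Running the mapping from ground to image in this direction guarantees that every DOM cell is assigned a gray value, avoiding holes in the output ortho-image.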