Article

A Perspective Distortion Correction Method for Planar Imaging Based on Homography Mapping

Key Laboratory of Mechanism Theory and Equipment Design of State Ministry of Education, Tianjin University, Tianjin 300350, China
* Author to whom correspondence should be addressed.
Sensors 2025, 25(6), 1891; https://doi.org/10.3390/s25061891
Submission received: 18 February 2025 / Revised: 8 March 2025 / Accepted: 17 March 2025 / Published: 18 March 2025
(This article belongs to the Section Optical Sensors)

Abstract

In monocular vision measurement, a barrier to implementation is the perspective distortion caused by manufacturing errors in the imaging chip and by non-parallelism between the measurement plane and its image, which seriously affects the accuracy of the pixel equivalent and of the measurement results. This paper proposes a perspective distortion correction method for planar imaging based on homography mapping. The factors causing perspective distortion are first analyzed from the camera's intrinsic and extrinsic parameters, and a perspective transformation model is constructed. A corrected imaging plane is then constructed, and the model is calibrated by utilizing the homography between the measurement plane, the actual imaging plane, and the corrected imaging plane. The nonlinear and perspective distortions are corrected simultaneously by transforming the original image to the corrected imaging plane. In an experiment measuring the radius, length, angle, and area of a designed pattern, the root mean square errors are 0.016 mm, 0.052 mm, 0.16°, and 0.68 mm², and the standard deviations are 0.016 mm, 0.045 mm, 0.033°, and 0.65 mm², respectively. The proposed method can effectively solve the problem of high-precision planar measurement under perspective distortion.

1. Introduction

Vision measurement technology has played essential roles in several industrial sectors, particularly in intelligent manufacturing systems [1,2]. This technology leverages images for non-contact measurement, which circumvents the limitations of contact measurement with traditional measuring tools. It offers several advantages, including high precision, enhanced efficiency, non-scratch workpieces, and real-time tracking of the measurement results.
When measuring 2D parameters such as the radius, length, angle, and area of a target, a monocular vision system is typically employed. In this system, the target is placed on the measurement plane, and its image is captured by a camera installed vertically above the plane. The pixel equivalent method is then applied to measure the target's 2D parameters by calibrating the actual physical size corresponding to a single pixel [3,4,5,6]. The maximum accuracy that the system can achieve is an important factor limiting its engineering application. To improve the accuracy, two primary factors must be considered. First, the image edge detection algorithm should extract the target's edge accurately. Second, the captured image must have minimal perspective distortion, which requires a finely manufactured imaging chip and perfect parallelism between the measurement plane and its image [7,8,9]. If the manufacturing or the parallelism is poor, the measurement plane will produce perspective distortion in imaging, leading to an inaccurate pixel equivalent and ultimately affecting the measurement accuracy.
The existing solutions for the perspective distortion problem can be divided into two categories: hardware and algorithm.
At the hardware level, perspective distortion can be minimized by carefully adjusting the camera's position relative to the measurement plane [10]. However, the captured images may still contain slight perspective distortion, and in some application scenarios the camera must be tilted to capture the measurement plane. Another approach involves replacing the standard optical lens with a high-cost telecentric lens [11,12]. However, the telecentric lens must cover the entire measurement plane during application, making it inapplicable to measurements with a large field of view. In addition, although the telecentric lens provides consistent magnification in the depth direction, it also leads to oblique projection. As depicted in Figure 1, if a checkerboard is tilted relative to the camera, the aspect ratio of the board is altered in telecentric lens imaging. The third approach is adding additional sensors [13,14,15]. Liu S, Ge Y, Wang S et al. used a monocular camera with structured light to measure the center distance of planar holes, with a measurement accuracy of 0.1 mm [13]. Chen L, Zhong G, Han Z et al. used a binocular camera to measure the size of planar rectangular workpieces, achieving a measurement accuracy of 0.02 mm [14]. Although this approach effectively avoids perspective errors, adding sensors increases both the difficulty of calibration and the overall cost. Consequently, hardware-level solutions either increase costs or do not fully remove the distortion.
At the algorithmic level, one approach is the coordinate transformation method [16,17,18], in which the target's world coordinates are computed through the calibrated camera imaging model. Zhang X and Yin H measured the cable cross-sectional radius by aligning a checkerboard with the cable cross-section during measurement, with a measurement accuracy of 0.768 mm [16]. Similarly, Miao J, Tan Q, Liu S et al. used a checkerboard aligned to the gear end face to calibrate the camera imaging model and measured the gear's pitch error within 0.06 mm [17]. Although this method effectively avoids the errors caused by perspective distortion, it is inefficient because the world coordinates must be computed point by point. Consequently, it is only suitable for measuring simple and regular shapes, not complex and irregular shapes, especially the area of irregular planar shapes. Another approach is to correct perspective distortion by calibrating the conventional perspective transformation model, either from vanishing points or from control point pairs. The former calibrates the model by detecting vanishing points in images [19,20,21]. Lin J and Peng J corrected the perspective distortion of railroad track images using vanishing points and further realized visual detection of the tracks [20]. However, vanishing points are susceptible to camera nonlinear distortion and depend on the presence of parallel lines, resulting in lower correction accuracy; the method is generally used for perspective correction of buildings, roads, and text. The latter calibrates the model by setting control point pairs in images [22,23]. Wang Q, Zhou Q, Jing G et al. corrected the perspective distortion of circular saw core images by utilizing a checkerboard to set control point pairs [22]. The precise position of the circular saw core was then obtained through the calibrated pixel equivalent. However, in this approach the ideal control points corresponding to the distorted points are set manually, which can easily create a zoom effect on the whole image, affecting the pixel equivalent accuracy and causing the corrected image to lose the distance scale of the measurement plane relative to the camera. Additionally, the correction accuracy relies heavily on the precision of the manually set ideal control points; i.e., the corrected image will still exhibit large perspective distortion if the ideal control points are set improperly. The last approach is correcting perspective distortion with deep learning techniques [24,25]. However, this method requires extensive, accurately labeled samples for model training and necessitates retraining when the target changes, so its training cost significantly exceeds that of conventional methods. Moreover, unlike conventional approaches, its model parameters lack explicit physical meaning, which compromises the interpretability of the system.
The foregoing discussion shows that monocular vision measurement technology lacks a low-cost, automated, and highly accurate method for correcting planar perspective distortion. If such a method were available, the accuracy of the pixel equivalent method could be further improved and its range of application further expanded. To this end, this paper proposes a perspective distortion correction method for planar imaging based on homography mapping and builds an experimental platform to verify its effectiveness. First, the factors causing perspective distortion of the measurement plane are analyzed from the camera's intrinsic and extrinsic parameters, and a perspective transformation model of the image is constructed. A corrected imaging plane is then constructed based on adjustment of these parameters. The established model is calibrated by utilizing the homography between the measurement plane, the actual imaging plane, and the corrected imaging plane. The nonlinear and perspective distortions in the original image are corrected simultaneously by transforming the original image to the corrected imaging plane. Finally, the proposed method is verified to effectively correct perspective distortion in planar imaging and to maintain higher measurement accuracy and stability than existing methods.
The remainder of this paper is organized as follows. The causes of perspective distortion are analyzed in Section 2. The distortion correction method is proposed in Section 3. Experiments evaluating the method's performance are presented in Section 4 before conclusions are drawn in Section 5.

2. Perspective Distortion Analysis

Analyzing the causes of perspective distortion is a prerequisite for its correction. This section analyzes the causes from the camera’s intrinsic and extrinsic parameters according to the pinhole imaging model.

2.1. Analysis from Extrinsic Parameters

When it comes to perspective distortion in planar imaging, the first factor that comes to mind is that the camera's optical axis is not exactly perpendicular to the measurement plane, resulting in non-parallelism between the measurement plane and the camera's imaging plane. This non-parallelism is reflected in the transformation from the world coordinate system $O_w X_w Y_w Z_w$ to the camera coordinate system $O_c X_c Y_c Z_c$ according to the pinhole imaging model. Defining $O_w X_w Y_w Z_w$ on the measurement plane, with its plane $Z_w = 0$ coinciding with the measurement plane, the transformation can be presented as:
$$\begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix} = \begin{bmatrix} R & t \\ \mathbf{0} & 1 \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ 0 \\ 1 \end{bmatrix} \tag{1}$$
where $t = [t_1 \ t_2 \ t_3]^T$ is the 3 × 1 translation vector and $R = [r_1 \ r_2 \ r_3]$ is the 3 × 3 rotation matrix.
From Equation (1), if the measurement plane is parallel to the imaging plane, the $Z_c$ values of the points $(X_w, Y_w)$ on the measurement plane are all equal, and in this case R is an ideal matrix. When the two planes are not parallel, R becomes non-ideal, so R can be used to quantitatively assess the parallelism between the planes. However, the parameters in R are numerous and unintuitive; thus, R is further decomposed into Euler angles. Assuming that $O_w X_w Y_w Z_w$ first rotates by the Euler angle γ around the $X_c$ axis, then by the Euler angle β around the $Y_c$ axis, and finally by the Euler angle α around the $Z_c$ axis, R can be decomposed as:
$$R = \begin{bmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\gamma & -\sin\gamma \\ 0 & \sin\gamma & \cos\gamma \end{bmatrix} \tag{2}$$
According to Equations (1) and (2), the Zc value of each ( X w , Y w ) can be computed as:
$$Z_c = t_3 - \sin\beta \, X_w + \cos\beta \sin\gamma \, Y_w \tag{3}$$
From Equation (3), the Zc value of each ( X w , Y w ) is equal to t3 when β and γ are equal to zero, at which point the measurement plane is completely parallel to the imaging plane. Further, the impact of the three Euler angles on plane imaging is shown in Figure 2.
The above analysis shows that the Euler angles β and γ in the extrinsic parameters are what cause perspective distortion in planar imaging.
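As a concrete illustration (not part of the original derivation), the following MATLAB sketch builds R from three hypothetical Euler angles as in Equation (2) and evaluates the depth $Z_c$ of Equation (3) for a few points on the measurement plane; the depth varies across the plane whenever β or γ is nonzero, which is precisely the source of the perspective distortion.

```matlab
% Illustrative sketch of Equations (2)-(3); all numeric values are hypothetical.
alpha = deg2rad(5); beta = deg2rad(10); gamma = deg2rad(-3);   % Euler angles
t = [2; -1; 520];                         % translation vector, t3 in mm

Rz = [cos(alpha) -sin(alpha) 0; sin(alpha) cos(alpha) 0; 0 0 1];
Ry = [cos(beta) 0 sin(beta); 0 1 0; -sin(beta) 0 cos(beta)];
Rx = [1 0 0; 0 cos(gamma) -sin(gamma); 0 sin(gamma) cos(gamma)];
R  = Rz * Ry * Rx;                        % Equation (2)

Pw = [0 0; 40 0; 0 40; 40 40];            % points (X_w, Y_w) on the measurement plane, mm
Zc  = t(3) - sin(beta)*Pw(:,1) + cos(beta)*sin(gamma)*Pw(:,2);  % Equation (3)
Zc2 = R(3,1)*Pw(:,1) + R(3,2)*Pw(:,2) + t(3);                   % same values via Equation (1)
disp([Zc Zc2])                            % equal to t3 only when beta = gamma = 0
```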

2.2. Analysis from Intrinsic Parameters

Even if the measurement plane is completely parallel to the imaging plane, the perspective distortion caused by manufacturing errors in camera imaging chips cannot be ignored. The manufacturing errors can be reflected in the camera’s intrinsic matrix K:
$$K = \begin{bmatrix} f_x & s & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \tag{4}$$
where $(u_0, v_0)$ represents the pixel coordinate of the principal point, s denotes the oblique (skew) factor of the imaging chip, and $f_x$ and $f_y$ represent the equivalent focal lengths in the x and y directions, respectively.
Meanwhile, f x and f y can be computed as:
$$f_x = F / p_x, \qquad f_y = F / p_y \tag{5}$$
where F represents the physical focal length of the camera, and $p_x$ and $p_y$ denote the width and height of a pixel unit, respectively.
As depicted in Figure 3, the perspective distortion caused by K has two main sources. One is that the smallest unit of the imaging chip is not a perfect square, i.e., $p_x$ is not equal to $p_y$. This makes the camera's magnification in the x and y directions inconsistent, causing $f_x$ and $f_y$ to be unequal. The other is that the two axes of the imaging chip are not exactly perpendicular, leaving a dip angle θ between them, which results in $s = f_x \cot\theta$.
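For reference, a minimal sketch (with hypothetical parameter values, not the calibration results of this paper) that assembles K from the physical focal length, the pixel pitch, and the dip angle θ according to Equations (4) and (5):

```matlab
% Sketch of Equations (4)-(5); the numbers below are placeholders, not calibrated values.
F  = 12;                    % physical focal length, mm
px = 2.2e-3; py = 2.3e-3;   % pixel width and height, mm (unequal -> fx ~= fy)
theta = deg2rad(89.9);      % dip angle of the imaging chip; 90 deg means no skew
u0 = 1296; v0 = 972;        % principal point, pixels

fx = F/px;  fy = F/py;      % Equation (5)
s  = fx*cot(theta);         % oblique factor
K  = [fx s u0; 0 fy v0; 0 0 1];   % Equation (4)
```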

3. Proposed Method

3.1. Model Construction

To correct the perspective distortion, a perspective transformation model is constructed to perform perspective transformation on original images:
$$\lambda_1 \begin{bmatrix} u' \\ v' \\ 1 \end{bmatrix} = T \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \tag{6}$$
where $\lambda_1$ represents the scale factor, $(u, v)$ denotes the pixel coordinate of the actual perspective-distorted point in the original image, $(u', v')$ denotes the pixel coordinate of the undistorted point in the corrected image, and T is a 3 × 3 perspective transformation matrix with 8 degrees of freedom.

3.2. Model Calibration

In traditional calibration methods for T, the detection of $(u, v)$ from the original images often neglects nonlinear distortion. Additionally, the corresponding $(u', v')$ must be set manually based on empirical knowledge, which results in low calibration accuracy of T and often introduces a zoom effect on the whole image [22,23]. To address these limitations, this study constructs a nonlinear distortion model and then calibrates T by leveraging the homography relationships between planes.

3.2.1. Calibration of the Actual Homography Matrix

The actual homography matrix reflects the homography relationship between the measurement plane and the actual imaging plane. As depicted in Figure 4, a pre-calibrated monocular camera is used to capture the original image of a checkerboard placed on the measurement plane. Then, the world coordinate system $O_w X_w Y_w Z_w$ is established on the checkerboard with its origin $O_w$ at the upper-left corner and its plane $Z_w = 0$ coinciding with the measurement plane. According to the pinhole imaging model, the mapping from the checkerboard world points $(X_w, Y_w)$ to the corresponding $(u, v)$ on the actual imaging plane can be represented as [26]:
$$\lambda_2 \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} r_1 & r_2 & t \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ 1 \end{bmatrix} \tag{7}$$
where λ 2 represents the scale factor.
According to the theory of projective transformation, there exists a homography mapping relationship between corresponding coordinate points on the measurement plane and the actual imaging plane. This relationship can be mathematically constructed as a 3 × 3 matrix Hreal:
$$\lambda_2 \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = H_{real} \begin{bmatrix} X_w \\ Y_w \\ 1 \end{bmatrix} \tag{8}$$
Comparing Equations (7) and (8), both K and $[r_1 \ r_2 \ t]$ are 3 × 3 matrices, so $K[r_1 \ r_2 \ t]$ is also a 3 × 3 matrix and reflects the homography mapping between $(X_w, Y_w)$ and $(u, v)$. Thus, $H_{real}$ can be further expressed as:
$$H_{real} = \begin{bmatrix} h_1 & h_2 & h_3 \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{bmatrix} = K \begin{bmatrix} r_1 & r_2 & t \end{bmatrix} \tag{9}$$
The actual homography matrix $H_{real}$ has 8 degrees of freedom; hence, it can be calibrated using at least 4 pairs of $(X_w, Y_w)$ and their corresponding $(u, v)$.
However, nonlinear distortion is inevitable due to manufacturing errors in the camera's optical system [26]. The coordinates of the checkerboard corners $(u_d, v_d)$ detected from the original image must therefore be corrected for nonlinear distortion to obtain $(u, v)$. For the pre-calibrated camera, let its radial distortion parameters be $k_1$, $k_2$ and its tangential distortion parameters be $p_1$, $p_2$; the distortion model can then be expressed as:
$$\begin{bmatrix} x_d \\ y_d \end{bmatrix} = (1 + k_1 r^2 + k_2 r^4) \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 2 p_1 x y + p_2 (r^2 + 2 x^2) \\ p_1 (r^2 + 2 y^2) + 2 p_2 x y \end{bmatrix} \tag{10}$$
where $r = \sqrt{x^2 + y^2}$, and $(x_d, y_d)$ represents the coordinate containing nonlinear distortion on the normalized plane; its transformation with $(u_d, v_d)$ is as follows:
$$x_d = (u_d - u_0 - s\, y_d)/f_x, \qquad y_d = (v_d - v_0)/f_y \tag{11}$$
where ( x , y ) represents the coordinate without nonlinear distortion on the normalized plane, and its transformation with ( u , v ) is as follows:
$$x = (u - u_0 - s\, y)/f_x, \qquad y = (v - v_0)/f_y \tag{12}$$
By substituting $(u_d, v_d)$ into Equations (10)–(12), $(u, v)$ can be obtained, and $H_{real}$ can then be calibrated.
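Equation (10) maps undistorted to distorted normalized coordinates, so recovering $(u, v)$ from a detected corner $(u_d, v_d)$ requires a numerical inversion. The sketch below (a hypothetical helper, not the authors' exact implementation) performs this inversion with a simple fixed-point iteration, which is one common choice:

```matlab
function [u, v] = undistortCorner(ud, vd, K, k1, k2, p1, p2)
% Sketch: distortion-free pixel (u, v) from a detected corner (ud, vd), Equations (10)-(12).
fx = K(1,1); fy = K(2,2); s = K(1,2); u0 = K(1,3); v0 = K(2,3);

yd = (vd - v0)/fy;                 % Equation (11)
xd = (ud - u0 - s*yd)/fx;

x = xd; y = yd;                    % fixed-point inversion of Equation (10)
for it = 1:20
    r2 = x^2 + y^2;
    radial = 1 + k1*r2 + k2*r2^2;
    dx = 2*p1*x*y + p2*(r2 + 2*x^2);
    dy = p1*(r2 + 2*y^2) + 2*p2*x*y;
    x = (xd - dx)/radial;
    y = (yd - dy)/radial;
end

u = fx*x + s*y + u0;               % Equation (12) rearranged
v = fy*y + v0;
end
```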

3.2.2. Generation of the Corrected Homography Matrix

Based on the analysis in Section 2, a virtual camera is created by adjusting the actual camera’s intrinsic and extrinsic parameters. The imaging plane of the virtual camera is defined as the corrected imaging plane, which is completely parallel to the measurement plane.
  • Intrinsic parameter adjustment:
Intrinsic parameter adjustment obtains the virtual camera's intrinsic matrix K′ from K. The oblique factor of K′ is set to zero, F remains the same, and the length and height of the single pixel unit are both set to $\sqrt{p_x p_y}$. Combining Equation (5), the equivalent focal lengths $f_x'$ and $f_y'$ of K′ can be computed as:
$$f_x' = f_y' = F / \sqrt{p_x p_y} = \sqrt{f_x f_y} \tag{13}$$
Thus, K′ can be expressed as:
$$K' = \begin{bmatrix} \sqrt{f_x f_y} & 0 & u_0 \\ 0 & \sqrt{f_x f_y} & v_0 \\ 0 & 0 & 1 \end{bmatrix} \tag{14}$$
  • Extrinsic parameter adjustment:
According to Equations (7) and (8), R and t can be computed by matrix decomposition:
$$r_1 = \frac{1}{\lambda_3} K^{-1} h_1, \quad r_2 = \frac{1}{\lambda_3} K^{-1} h_2, \quad r_3 = r_1 \times r_2, \quad t = \frac{1}{\lambda_3} K^{-1} h_3 \tag{15}$$
where λ 3 represents the scale factor.
Extrinsic parameter adjustment refers to adjusting R by decomposing it into α, β, and γ, preserving only α and setting β and γ to zero. The adjusted rotation matrix R′ can be expressed as:
$$R' = \begin{bmatrix} r_1' & r_2' & r_3' \end{bmatrix} = \begin{bmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{16}$$
At this point, the virtual camera is created by setting its intrinsic and extrinsic parameters to K′, R′, and t. Substituting these parameters into Equation (7), the corrected homography matrix $H_{rect}$ from $(X_w, Y_w)$ to the corresponding $(u', v')$ on the corrected imaging plane can be generated as:
$$\lambda_4 \begin{bmatrix} u' \\ v' \\ 1 \end{bmatrix} = K' \begin{bmatrix} r_1' & r_2' & t \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ 1 \end{bmatrix} = H_{rect} \begin{bmatrix} X_w \\ Y_w \\ 1 \end{bmatrix} \tag{17}$$
where $\lambda_4$ represents the scale factor.
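Putting Equations (13)–(17) together, the sketch below (hypothetical function name) derives the virtual camera from K and a calibrated $H_{real}$ and returns $H_{rect}$. The scale factor $\lambda_3$ is fixed by normalizing $r_1$ to unit length, as in Zhang's homography decomposition, and the extraction of α from $r_1$ assumes |β| < 90°; this is a sketch under those assumptions, not the paper's exact implementation.

```matlab
function Hrect = buildHrect(K, Hreal)
% Sketch of the corrected homography generation, Equations (13)-(17).
f  = sqrt(K(1,1)*K(2,2));                 % Equation (13)
Kp = [f 0 K(1,3); 0 f K(2,3); 0 0 1];     % Equation (14)

h1 = Hreal(:,1); h2 = Hreal(:,2); h3 = Hreal(:,3);
lambda3 = norm(K \ h1);                   % Equation (15): scale so that r1 is a unit vector
r1 = (K \ h1)/lambda3;
r2 = (K \ h2)/lambda3;
t  = (K \ h3)/lambda3;
if t(3) < 0                               % enforce a positive object distance t3
    r1 = -r1; r2 = -r2; t = -t;
end

alpha = atan2(r1(2), r1(1));              % rotation about the optical axis
Rp = [cos(alpha) -sin(alpha) 0; sin(alpha) cos(alpha) 0; 0 0 1];   % Equation (16)

Hrect = Kp * [Rp(:,1) Rp(:,2) t];         % Equation (17)
end
```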

3.2.3. Calibration

From Equation (8), we can obtain the following:
$$\begin{bmatrix} X_w \\ Y_w \\ 1 \end{bmatrix} = \lambda_2 H_{real}^{-1} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \tag{18}$$
Then, substitute Equation (18) into Equation (17):
$$\lambda_4 \begin{bmatrix} u' \\ v' \\ 1 \end{bmatrix} = \lambda_2 H_{rect} H_{real}^{-1} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \tag{19}$$
Combining Equation (19) with Equation (6) yields:
$$T = H_{rect} H_{real}^{-1} \tag{20}$$
At this point, the calibration of T is complete. The whole homography relationship is shown in Figure 5. After the nonlinear distortion is corrected, the perspective-distorted image on the actual imaging plane can be transformed into an undistorted image on the corrected imaging plane through Equation (6).
The imaging of the corrected image conforms to the pinhole imaging principle of the virtual camera. From Equation (3), the object distance from each $(X_w, Y_w)$ to the virtual camera is $t_3$. Meanwhile, the virtual camera's equivalent focal lengths in the x and y directions are both equal to $\sqrt{f_x f_y}$, implying that the pixel equivalents M in the x and y directions are equal and can be computed as:
$$M = t_3 / \sqrt{f_x f_y} \tag{21}$$
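With $H_{real}$ and $H_{rect}$ in hand, T and the pixel equivalent follow directly from Equations (20) and (21). A minimal sketch, assuming the variables Hreal, Hrect, K, and the translation t recovered via Equation (15) are available from the preceding steps:

```matlab
% Sketch of Equations (20)-(21); Hreal, Hrect, K and t are assumed to exist.
T = Hrect / Hreal;                 % right division, equivalent to Hrect*inv(Hreal)
T = T / T(3,3);                    % optional normalization so that T(3,3) = 1
M = t(3) / sqrt(K(1,1)*K(2,2));    % pixel equivalent, mm per pixel
% A segment of n pixels measured on the corrected image corresponds to n*M mm.
```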

3.2.4. Method to Improve Calibration Accuracy

The transformation relationships among the four types of coordinate points in this study are depicted in Figure 6. From the figure, T is determined by $H_{real}$ and $H_{rect}$, while $H_{rect}$ is computed from K and $H_{real}$. Thus, the accuracy of T depends on the calibration accuracy of the camera and of $H_{real}$. Assuming the camera is calibrated with high accuracy, $H_{real}$ is calibrated as follows to improve the accuracy of T:
First, rearrange Equations (8) and (9):
$$u = \frac{h_{11} X_w + h_{12} Y_w + h_{13}}{h_{31} X_w + h_{32} Y_w + 1}, \qquad v = \frac{h_{21} X_w + h_{22} Y_w + h_{23}}{h_{31} X_w + h_{32} Y_w + 1} \tag{22}$$
Further, transforming Equation (22), we can obtain the following:
$$\begin{bmatrix} X_w & Y_w & 1 & 0 & 0 & 0 & -X_w u & -Y_w u \\ 0 & 0 & 0 & X_w & Y_w & 1 & -X_w v & -Y_w v \end{bmatrix} \eta^T = \begin{bmatrix} u \\ v \end{bmatrix} \tag{23}$$
where $\eta = [h_{11} \ h_{12} \ h_{13} \ h_{21} \ h_{22} \ h_{23} \ h_{31} \ h_{32}]$.
A pair of $(X_w, Y_w)$ and its corresponding $(u, v)$ yields two linear equations, which means $H_{real}$ can be calibrated from 4 pairs of matching points. However, to minimize the impact of image noise and of errors in checkerboard corner extraction, this study uses as many checkerboard corners as possible and applies the least squares method to calibrate $H_{real}$. Meanwhile, the checkerboard used for calibration should have high production accuracy and good imaging quality to ensure accuracy.
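A compact sketch of this least-squares calibration (hypothetical function name), stacking two rows per corner as in Equation (23) and solving the overdetermined system with MATLAB's backslash operator:

```matlab
function Hreal = calibrateHreal(Pw, Puv)
% Sketch: least-squares calibration of Hreal, Equations (22)-(23).
% Pw  : n-by-2 world coordinates (X_w, Y_w) of the checkerboard corners (Z_w = 0)
% Puv : n-by-2 corresponding undistorted pixel coordinates (u, v)
n = size(Pw, 1);
A = zeros(2*n, 8);
b = zeros(2*n, 1);
for i = 1:n
    Xw = Pw(i,1); Yw = Pw(i,2); u = Puv(i,1); v = Puv(i,2);
    A(2*i-1,:) = [Xw Yw 1 0 0 0 -Xw*u -Yw*u];
    A(2*i,  :) = [0 0 0 Xw Yw 1 -Xw*v -Yw*v];
    b(2*i-1)   = u;
    b(2*i)     = v;
end
eta   = A \ b;                     % least-squares solution of Equation (23)
Hreal = [eta(1) eta(2) eta(3); eta(4) eta(5) eta(6); eta(7) eta(8) 1];
end
```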

3.2.5. Algorithm Design

In traditional methods, perspective distortion correction either neglects nonlinear distortion entirely or employs a two-stage image interpolation process, where nonlinear distortion correction is performed prior to perspective correction [19,20,21,22,23]. Both approaches inevitably lead to a degradation in the accuracy of the corrected images.
To ensure accuracy, this study integrates T with the nonlinear distortion model, enabling simultaneous correction of both perspective and nonlinear distortions through a single image interpolation process. The proposed distortion correction algorithm primarily consists of forward mapping and backward mapping, as depicted in Figure 7.
Initially, forward mapping is employed to determine the size of the corrected image. Let the width and height of the original image be $w_d$ and $h_d$, respectively. The four corner coordinates of the original image can be expressed as $(0, 0)$, $(0, h_d)$, $(w_d, 0)$, and $(w_d, h_d)$; substituting them as $(u_d, v_d)$ into Equations (10)–(12) gives four intermediate coordinates corresponding to $(u, v)$. Afterward, the four intermediate coordinates are substituted into Equation (6) as $(u, v)$ to obtain four new coordinates corresponding to $(u', v')$: $(x_1', y_1')$, $(x_2', y_2')$, $(x_3', y_3')$, $(x_4', y_4')$. Finally, the corrected image's width $w'$ and height $h'$ can be computed as:
$$w' = \mathrm{ceil}\big(\max(x_1', x_2', x_3', x_4') - \min(x_1', x_2', x_3', x_4')\big) + 1, \qquad h' = \mathrm{ceil}\big(\max(y_1', y_2', y_3', y_4') - \min(y_1', y_2', y_3', y_4')\big) + 1 \tag{24}$$
where ceil() is the round-up (ceiling) function.
Furthermore, backward mapping is used to obtain the gray values of the corrected image. Begin by creating a blank corrected image with width $w'$ and height $h'$. For each pixel $(u', v')$ in the blank image, compute its corresponding $(u, v)$ using Equation (6). Then, apply Equations (10)–(12) to compute the $(u_d, v_d)$ corresponding to $(u, v)$. Afterward, the gray value at $(u_d, v_d)$ is obtained by bicubic interpolation and assigned to $(u', v')$. After traversing all $(u', v')$, the corrected image is output.
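The sketch below (hypothetical function names; it reuses the undistortCorner helper sketched in Section 3.2.1) outlines the two passes: forward mapping of the four image corners to size the output, then backward mapping with bicubic interpolation to fill it. Pixel coordinates are assumed 0-based, and the origin offset of the corrected image is an added assumption that the paper does not detail.

```matlab
function Icorr = correctImage(I, T, K, dist)
% Sketch of the correction algorithm of Section 3.2.5; dist = [k1 k2 p1 p2].
[hd, wd] = size(I);
fx = K(1,1); fy = K(2,2); s = K(1,2); u0 = K(1,3); v0 = K(2,3);

% Forward mapping: corrected image size, Equation (24)
corners = [0 0; 0 hd; wd 0; wd hd];
P = zeros(4, 2);
for i = 1:4
    [u, v] = undistortCorner(corners(i,1), corners(i,2), K, dist(1), dist(2), dist(3), dist(4));
    q = T*[u; v; 1];                              % Equation (6)
    P(i,:) = q(1:2)'/q(3);
end
wq = ceil(max(P(:,1)) - min(P(:,1))) + 1;
hq = ceil(max(P(:,2)) - min(P(:,2))) + 1;
ox = min(P(:,1)); oy = min(P(:,2));               % assumed origin offset of the output

% Backward mapping: gray value of every corrected pixel
Icorr = zeros(hq, wq);
Tinv  = inv(T);
for vq = 0:hq-1
    for uq = 0:wq-1
        q = Tinv*[uq + ox; vq + oy; 1];           % back to the actual imaging plane
        u = q(1)/q(3);  v = q(2)/q(3);
        y = (v - v0)/fy;  x = (u - u0 - s*y)/fx;  % Equation (12)
        r2 = x^2 + y^2;
        xd = (1 + dist(1)*r2 + dist(2)*r2^2)*x + 2*dist(3)*x*y + dist(4)*(r2 + 2*x^2);
        yd = (1 + dist(1)*r2 + dist(2)*r2^2)*y + dist(3)*(r2 + 2*y^2) + 2*dist(4)*x*y;
        ud = fx*xd + s*yd + u0;  vd = fy*yd + v0; % Equations (10)-(11)
        Icorr(vq+1, uq+1) = interp2(double(I), ud+1, vd+1, 'cubic', 0);  % bicubic, 1-indexed
    end
end
end
```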

4. Experiment

4.1. Experiment Design

To verify the effectiveness of the proposed method, the designed visual measurement platform shown in Figure 8 was built; it includes a PC, an industrial camera, a machine vision platform, a ring light, an adjustment block, and a test board. The imaging system consists of an MER-500-14U3M CMOS camera (Daheng Imaging, Beijing, China) and an M1214-MP2 12 mm fixed-focal-length FA lens (Computar, Tokyo, Japan). The specific physical parameters of the camera are shown in Table 1. The image processing and measurement experiments of the proposed method are implemented in MATLAB2020a on the PC, which runs a 64-bit Windows 10 system with a 2.5 GHz CPU and 8 GB of RAM.
The test board is made of a 1 mm thick flat matte ceramic plate. Its pattern is shown in Figure 9, and the overall production accuracy of the pattern is ±1 µm. Each checkerboard square in the pattern is 5 mm in length and is used to calibrate the perspective transformation model. Additionally, to avoid measurement errors caused by misalignment between the measurement plane and the calibration plane [8], four standard test patterns, coplanar with the checkerboard pattern, are set on the test board to test the measurement performance of the proposed method. The circle pattern has a radius of 16 mm (labeled R16) and is used to evaluate radius measurement performance. The rectangle pattern has a length of 36 mm and a height of 22 mm (labeled L36 and H22, respectively) and is used to evaluate length measurement performance. The right triangle pattern has two acute angles of 37° and 53° (labeled D37 and D53, respectively) and is used to evaluate angle measurement performance. The ellipse pattern is viewed as an irregular planar shape with an area of 622.035 mm² (labeled Area) and is used to evaluate area measurement performance.
The experimental procedure consists of four steps. Step 1: Camera calibration. A camera calibration board is placed on the platform, and multiple images of the board are captured by changing its pose relative to the camera. Then, the camera is calibrated by Zhang's method.
Step 2: Image acquisition. The test board is placed on the platform. The position of the adjustment block beneath the test board is randomly adjusted to obtain images of the test board with different degrees of perspective distortion. To verify the repeatability of the proposed method, fifty original images of the test board in various poses are acquired. Specifically, half are captured during the day and the other half at night. Note that the test board should remain within the camera’s field of view and depth of field throughout the entire capture process. Meanwhile, the test board is well illuminated during the experiments.
Step 3: Image processing. The pixel coordinates $(u_d, v_d)$ in the original images are obtained by checkerboard corner detection. The T of each original image is then calibrated by the proposed method. Further, the corrected images are obtained by the designed correction algorithm.
Step 4: Experimental verification. To demonstrate the effectiveness of the method, four experiments are set up: Exp.1 is to verify its accuracy by calculating the reprojection error of T; Exp.2 is to evaluate the residual perspective distortion in the corrected images by computing their extrinsic parameters; Exp.3 is to compare the measurement accuracy with the existing methods by measuring R16, L36, H22, D37, D53, and Area; Exp.4 is to investigate the impact of camera calibration errors on the proposed method.

4.2. Results and Evaluations

Using the Camera Calibrator App in MATLAB2020a, the camera intrinsic parameters were calibrated through Zhang’s method, and the calibration results are shown in Table 2. The overall reprojection error of camera calibration results obtained from the App is 0.055 pixels.
The computed K′ is shown as follows:
$$K' = \begin{bmatrix} 5497.138 & 0 & 1292.928 \\ 0 & 5497.138 & 960.322 \\ 0 & 0 & 1 \end{bmatrix}$$
Under the experimental conditions, including the specific computer hardware and camera resolution employed, the designed algorithm achieves an average correction time of 1.03 s per image.
To conduct Exp.1 and Exp.2, nine images with different degrees of perspective distortion were selected from the fifty original images. Their extrinsic parameters, α, β, γ and t3, are shown in Table 3.
Of the nine selected images, Pose 4 and Pose 8, which have severe perspective distortion according to β and γ, were selected as examples. The comparison between their original and corrected images is shown in Figure 10.
Exp.1: Compute the reprojection error to verify the accuracy of T. After calibrating T and obtaining the corrected images corresponding to the nine selected original images, the pixel coordinates $(u', v')$ of the checkerboard corners in each corrected image were re-detected, and their reprojected counterparts $(\hat{u}', \hat{v}')$ were computed by mapping the corresponding corners $(u, v)$ of the original image through T. In the same way as the reprojection error is calculated for camera calibration, the reprojection error $P_{err}$ of each checkerboard corner, as well as the mean reprojection error $MP_{err}$ of the entire checkerboard (containing m corners), can be computed by:
$$P_{err} = \sqrt{(u' - \hat{u}')^2 + (v' - \hat{v}')^2}$$
$$MP_{err} = \frac{1}{m} \sum_{i=1}^{m} P_{err}$$
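A minimal sketch of this evaluation (hypothetical variable names): Puv holds the undistorted corners of the original image, PuvCorr the corners re-detected in the corrected image, and each original corner is mapped through T before comparison.

```matlab
% Sketch: reprojection error of T (the two indicators defined above).
m = size(Puv, 1);
Perr = zeros(m, 1);
for i = 1:m
    q = T*[Puv(i,1); Puv(i,2); 1];        % predicted corrected coordinates
    pred = q(1:2)'/q(3);
    Perr(i) = norm(PuvCorr(i,:) - pred);  % per-corner reprojection error
end
MPerr = mean(Perr);                        % mean reprojection error
```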
The calculation results of $P_{err}$ and $MP_{err}$ for each pose are plotted as heatmaps in Figure 11, where the coordinates of each small square correspond to the world coordinates of a checkerboard corner and each square's color represents the value of $P_{err}$ at that position.
From the results, it can be concluded that T achieves high accuracy when the camera is well calibrated. In terms of $MP_{err}$, the values for all poses are slightly below the camera calibration's mean reprojection error of 0.055 pixels, indicating that the accuracy of the proposed method is closely related to the camera calibration accuracy. In terms of $P_{err}$, the majority of values are below 0.05 pixels, especially in the central region (6 × 7 = 42 checkerboard corners), where no significant outliers are observed, indicating that the method has the best correction accuracy in the center region of the image. In contrast, in the four edge regions the $P_{err}$ of some checkerboard corners occasionally approaches 0.1 pixels, which can be caused by image noise or by errors in the nonlinear distortion model.
Exp.2: Evaluate the residual perspective distortion in the corrected images. The homography matrix $H_{real}'$ relating $(u', v')$ and $(X_w, Y_w)$ was computed by the study's method. Further, the extrinsic parameters of the measurement plane relative to the corrected imaging plane, i.e., the Euler angles α′, β′, γ′ and the depth value $t_3'$, were computed by substituting K′ and $H_{real}'$ into Equations (2) and (15).
The corrected imaging plane constructed in this study is completely parallel to the measurement plane, so the values of β′ and γ′ should be close to zero. At the same time, the α and $t_3$ of the actual imaging plane relative to the measurement plane are preserved in the corrected images, so the α error $\alpha_{err}$ and the depth error $D_{err}$, defined as follows, should also be close to zero.
$$\alpha_{err} = \left| \alpha' - \alpha \right|$$
$$D_{err} = \left| t_3' - t_3 \right|$$
The computed extrinsic parameters of the corrected images are shown in Table 4. The computed evaluation indicators β′, γ′, $\alpha_{err}$, and $D_{err}$ are shown in Figure 12.
From the results, it can be seen that both β′ and γ′ of all poses are less than 0.014°, and in some poses their values are less than 0.005°, which proves that the corrected imaging plane is almost completely parallel to the measurement plane. Hence, the proposed method can effectively correct the perspective distortion. For all poses, $\alpha_{err}$ is less than 0.001° and $D_{err}$ is within 0.016 mm, which proves that the proposed method effectively preserves the original rotation angle and depth value of the measurement plane relative to the actual imaging plane along the optical axis direction.
Exp.3: Compare the measurement accuracy with existing methods. The pixel equivalent for each of the fifty corrected images was calibrated from Equation (21). The measurement values of R16, L36, H22, D37, D53, and Area were computed with the calibrated pixel equivalents. Their corresponding measurement errors were then obtained by subtracting the true values from the measurement values.
For comparison purposes, two of the most widely used methods were implemented. One measures these parameters by calibrating the homography matrix $H_{real}$ as in Ref. [17] (labeled Miao2020). The other measures these parameters through image perspective transformation as in Ref. [22] (labeled Wang2023), in which T is calibrated by selecting $(u, v)$ and manually setting their corresponding $(u', v')$ as control point pairs.
Note that the three methods are identical in image processing and parameter calculation. After applying Canny edge detection to the corrected images, R16 is obtained by fitting the circle with the least squares method; L36 and H22 are obtained by calculating the distances between vertices after fitting the lines with the least squares method; and D37 and D53 are obtained from the fitted lines' slopes. Because the ellipse pattern is treated as an irregular shape, Area is obtained by counting the number of pixels after binarizing the corrected images with the OTSU method, so Miao2020 cannot be applied to this case.
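As an illustration of the area measurement step (assuming MATLAB's Image Processing Toolbox for mat2gray, graythresh, and imbinarize; roi and M are hypothetical variables holding a grayscale crop of the corrected image and the calibrated pixel equivalent), the pixel count of the binarized region is converted to square millimetres as follows:

```matlab
% Sketch: area measurement from a corrected image region; roi and M are assumed.
g  = mat2gray(roi);                  % normalize the grayscale crop
bw = imbinarize(g, graythresh(g));   % OTSU threshold
bw = ~bw;                            % assume the printed pattern is darker than the board
areaMM2 = nnz(bw) * M^2;             % pixel count times (mm per pixel)^2
```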
To visualize the distribution of the results, the measurement errors derived from the three methods are plotted into standard box plots, as shown in Figure 13.
From the results, the errors of the proposed method are within [−0.03 mm, 0.03 mm] when measuring R16, which is more centralized than the other two methods. When measuring L36 and H22, the errors of the proposed method are within [−0.08 mm, 0.07 mm] and [−0.14 mm, 0.08 mm], respectively, which are more centralized than those of Miao2020; although Wang2023 has the best concentration, several outliers appear. When measuring D37 and D53, the errors of the proposed method are within [0.09°, 0.19°] and [−0.22°, −0.09°], respectively, which are better distributed than those of Wang2023 and similar to those of Miao2020. When measuring Area, the errors of the proposed method are within [−1.3 mm², 1 mm²], which is superior to Wang2023 in terms of both measurement accuracy and distribution.
To further evaluate the measurement performance of the three methods, the root mean square errors (RMSEs) and the standard deviations (SDs) of the measurement errors were computed to compare measurement accuracy and stability, respectively. The formulas for RMSE and SD are defined as follows:
$$RMSE = \sqrt{\frac{1}{50} \sum_{i=1}^{50} \left( X_{real} - X_i \right)^2}$$
$$SD = \sqrt{\frac{1}{50} \sum_{i=1}^{50} \left( X_i - \bar{X} \right)^2}$$
where $X_{real}$ represents the true value, $X_i$ represents the i-th measurement value, and $\bar{X}$ represents the mean of the measurement values.
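For completeness, the two indicators computed over the fifty measurements of one parameter (hypothetical variable names: x is the 50-by-1 vector of measured values, xreal the true value):

```matlab
% Sketch: accuracy and stability indicators over the fifty measurements.
rmseVal = sqrt(mean((xreal - x).^2));     % RMSE
sdVal   = sqrt(mean((x - mean(x)).^2));   % SD (population form, as defined above)
```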
The RMSE of the three methods is shown in Figure 14, and the SD is shown in Figure 15.
From the results, the RMSEs of R16, L36, H22, D37, D53, and Area for the proposed method are within 0.016 mm, 0.052 mm, 0.050 mm, 0.14°, 0.16°, and 0.68 mm², and the SDs are within 0.016 mm, 0.043 mm, 0.045 mm, 0.025°, 0.033°, and 0.65 mm², respectively. The proposed method maintains higher accuracy and stability than the existing methods and even outperforms them in some cases. In particular, when measuring Area, the RMSE of the proposed method is 32% lower than that of Wang2023 and the SD is 34% lower. In addition, the proposed method does not require manually setting $(u', v')$ as in Wang2023, which avoids the image scaling problem caused by improperly set control point pairs.
Overall, the proposed method can effectively improve the accuracy of the pixel equivalent method; thus, it can be applied to realize the high-precision planar measurement under perspective distortion.
Exp. 4: Investigate the impact of camera calibration errors on the proposed method. Pose 1 was selected as the experimental subject, with the camera calibration results from the experiment serving as the ground truth. During the calibration process of T, different error amounts ( Δ f x , Δ f y , Δ s , Δ β , Δ γ ) were added to parameters f x , f y , s , β , and γ , respectively, to perform calibration and generate corresponding corrected images. Subsequently, measurements of L36 and H22 were extracted from these corrected images to evaluate the sensitivity of the proposed method to camera calibration errors. The experimental results are illustrated in Figure 16.
According to Equation (21), increases in the errors of $f_x$ and $f_y$ reduce M, thereby enlarging the measurement errors of L36 and H22, which is confirmed by the experimental results. Specifically, the error of $f_x$ mainly affects the measurement result of L36, and the error of $f_y$ mainly affects that of H22. The error in s has a relatively minor impact on the measurement results, but its influence intensifies with increasing pixel length. The error in β primarily causes scaling along the x direction, reducing the pixel length of L36 and consequently increasing its measurement error; similarly, the error in γ mainly induces scaling along the y direction, shortening the pixel length of H22 and amplifying its measurement error. Overall, the proposed method achieves satisfactory measurement performance when the camera calibration errors remain within acceptable limits.
Additionally, β′ and γ′ were further calculated from the corrected images to assess the planar tilt residuals caused by camera calibration errors. The experimental results are presented in Figure 17.
From the results, it can be seen that, among the intrinsic parameters, errors in $f_x$ primarily induce β′, errors in $f_y$ mainly cause γ′, and errors in s contribute to both β′ and γ′; overall, however, the impact of intrinsic parameter errors on the tilt residuals is small. In contrast, within the extrinsic parameters, β′ is almost equivalent to Δβ and γ′ nearly equals Δγ. Thus, it can be concluded that when the camera is well calibrated, extrinsic parameter errors become the dominant factor causing tilt residuals.

5. Conclusions

In monocular vision measurement, a barrier to implementation is the perspective distortion caused by manufacturing errors in the imaging chip and non-parallelism between the measurement plane and its image, which makes it challenging to improve the accuracy of pixel equivalent and measurement results. To address this issue, the paper proposed a perspective distortion correction method for planar imaging based on homography mapping.
This method overcomes the limitations of traditional approaches that require the manual setting of ideal points for calibrating the perspective transformation model. Instead, it achieves calibration solely through the homography relationship between planes. Furthermore, the proposed method integrates the perspective transformation model with the nonlinear distortion model, enabling simultaneous correction of both perspective and nonlinear distortions through a single image interpolation process.
In the experiments, the proposed method demonstrated high accuracy, with a mean reprojection error of less than 0.05 pixels. It effectively corrects distortions while preserving the original rotation angle and depth value of the measurement plane relative to the actual imaging plane along the optical axis. In measuring the radius, length, angle, and area of the designed pattern, the RMSEs of the proposed method are within 0.016 mm, 0.052 mm, 0.16°, and 0.68 mm², with SDs of 0.016 mm, 0.045 mm, 0.033°, and 0.65 mm², respectively. Compared with existing methods, the proposed method exhibited lower RMSE and SD (32% and 34% lower, respectively, when measuring area), proving that it has higher accuracy and stability. The proposed method can effectively improve the accuracy of the pixel equivalent method, thus realizing high-precision planar measurement under perspective distortion.
Based on current research, future work will focus on two aspects: firstly, integrating the proposed method with downstream edge detection algorithms to achieve measurement of complex targets in industrial scenarios, and secondly, optimizing algorithm design to meet the requirements of large field-of-view and high real-time performance in industrial visual measurement systems.

Author Contributions

Conceptualization, C.W., Y.D. and J.M.; methodology, C.W. and K.C.; software, C.W. and J.L.; validation, C.W. and K.C.; formal analysis, C.W.; investigation, C.W.; resources, C.W. and Q.X.; data curation, C.W. and K.C.; writing—original draft preparation, C.W.; writing—review and editing, C.W., J.M., J.L. and Q.X.; visualization, C.W. and J.L.; supervision, Y.D. and J.M.; project administration, Y.D.; funding acquisition, Y.D. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Natural Science Foundation of China, grant number 52375507.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Dataset available on request from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Yang, Y.; Zhang, C.; Hu, C. Research on Aeroengine Detection Technology Based on Machine Vision; SPIE: Beijing, China, 2023; Volume 12963. [Google Scholar]
  2. Javaid, M.; Haleem, A.; Singh, R.P.; Rab, S.; Suman, R. Exploring impact and features of machine vision for progressive industry 4.0 culture. Sens. Int. 2022, 3, 100132. [Google Scholar] [CrossRef]
  3. Nogueira, V.V.E.; Barca, L.F.; Pimenta, T.C. A Cost-Effective Method for Automatically Measuring Mechanical Parts Using Monocular Machine Vision. Sensors 2023, 23, 5994. [Google Scholar] [CrossRef] [PubMed]
  4. Fang, Y.; Wang, X.; Xin, Y.; Luo, Y. Sub-pixel dimensional and vision measurement method of eccentricity for annular parts. Appl. Opt. 2022, 61, 1531–1538. [Google Scholar] [CrossRef]
  5. Fang, S.; Yang, L.; Tang, J.; Guo, W.; Zeng, C.; Shao, P. Visual measurement of lateral relative displacement of wheel-rail of high-speed train under earthquake. Eng. Struct. 2024, 305, 117736. [Google Scholar] [CrossRef]
  6. Wang, X.; Li, F.; Du, Q.; Zhang, Y.; Wang, T.; Fu, G.; Lu, C. Micro-amplitude vibration measurement using vision-based magnification and tracking. Measurement 2023, 208, 112464. [Google Scholar] [CrossRef]
  7. Sun, Y. Analysis for center deviation of circular target under perspective projection. Eng. Comput. 2019, 36, 2403–2413. [Google Scholar] [CrossRef]
  8. Wang, S.; Li, X.; Zhang, Y.; Xu, K. Effects of camera external parameters error on measurement accuracy in monocular vision. Measurement 2024, 229, 114413. [Google Scholar] [CrossRef]
  9. Fang, L.; Shi, Z.L.; Li, C.X.; Liu, Y.P.; Zhao, E.B. Geometric transformation modeling for line-scan images under different camera poses. Opt. Eng. 2022, 61, 10103. [Google Scholar] [CrossRef]
  10. Li, X.; Liu, W.; Pan, Y.; Ma, J.; Wang, F. A Knowledge-Driven Approach for 3D High Temporal-Spatial Measurement of an Arbitrary Contouring Error of CNC Machine Tools Using Monocular Vision. Sensors 2019, 19, 744. [Google Scholar] [CrossRef]
  11. Liu, F.; Li, J.; Yang, Q.; Gao, P.; Ni, Y.; Wang, L. Monocular spatial geometrical measurement method based on local geometric elements associated with out-of-view datum. Measurement 2023, 214, 112828. [Google Scholar] [CrossRef]
  12. Poyraz, A.G.; Kaçmaz, M.; Gürkan, H.; Dirik, A.E. Sub-Pixel counting based diameter measurement algorithm for industrial Machine vision. Measurement 2024, 225, 114063. [Google Scholar] [CrossRef]
  13. Liu, S.; Ge, Y.; Wang, S.; He, J.; Kou, Y.; Bao, H.; Tan, Q.; Li, N. Vision measuring technology for the position degree of a hole group. Appl. Opt. 2023, 62, 869–879. [Google Scholar] [CrossRef]
  14. Chen, L.; Zhong, G.; Han, Z.; Li, Q.; Wang, Y.; Pan, H. Binocular visual dimension measurement method for rectangular workpiece with a precise stereoscopic matching algorithm. Meas. Sci. Technol. 2022, 34, 035010. [Google Scholar] [CrossRef]
  15. Le, K.; Yuan, Y. Based on the Geometric Characteristics of Binocular Imaging for Yarn Remaining Detection. Sensors 2025, 25, 339. [Google Scholar] [CrossRef] [PubMed]
  16. Zhang, X.; Yin, H. A Monocular Vision-Based Framework for Power Cable Cross-Section Measurement. Energies 2019, 12, 3034. [Google Scholar] [CrossRef]
  17. Miao, J.; Tan, Q.; Liu, S.; Bao, H.; Li, X. Vision measuring method for the involute profile of a gear shaft. Appl. Opt. 2020, 59, 4183–4190. [Google Scholar] [CrossRef]
  18. Wang, L.C.; Dong, J.H.; Cheng, Q.Y.; Shang, Y.J.; Geng, S.Q. Velocity measurement of moving target based on rotating mirror high speed camera. Opt. Eng. 2023, 62, 17. [Google Scholar] [CrossRef]
  19. Santana-Cedrés, D.; Gomez, L.; Alemán-Flores, M.; Salgado, A.; Esclarín, J.; Mazorra, L.; Alvarez, L. Automatic correction of perspective and optical distortions. Comput. Vis. Image Underst. 2017, 161, 1–10. [Google Scholar] [CrossRef]
  20. Lin, J.; Peng, J. Adaptive inverse perspective mapping transformation method for ballasted railway based on differential edge detection and improved perspective mapping model. Digit. Signal Process. 2023, 135, 103944. [Google Scholar] [CrossRef]
  21. Merino-Gracia, C.; Mirmehdi, M.; Sigut, J.; González-Mora, J.L. Fast perspective recovery of text in natural scenes. Image Vis. Comput. 2013, 31, 714–724. [Google Scholar] [CrossRef]
  22. Wang, Q.; Zhou, Q.; Jing, G.; Bai, S. Circular saw core localization in the quenching process using machine vision. Opt. Laser Technol. 2023, 161, 109111. [Google Scholar] [CrossRef]
  23. Zhao, X.; Du, H.; Yu, D. Improving Measurement Accuracy of Deep Hole Measurement Instruments through Perspective Transformation. Sensors 2024, 24, 3158. [Google Scholar] [CrossRef]
  24. Li, X.Y.; Zhang, B.; Liao, J.; Sander, P. Document Rectification and Illumination Correction using a Patch-based CNN. ACM Trans. Graph. (TOG) 2019, 38, 1–11. [Google Scholar] [CrossRef]
  25. Li, Y.; Wright, B.; Hameiri, Z. Deep learning-based perspective distortion correction for outdoor photovoltaic module images. Sol. Energy Mater. Sol. Cells 2024, 277, 113107. [Google Scholar] [CrossRef]
  26. Zhang, Z. A flexible new technique for camera calibration. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 1330–1334. [Google Scholar] [CrossRef]
Figure 1. Perspective distortion under different lenses (the gray dashed line represents the imaging field of view): (a) actual pattern; (b) imaging under an ordinary lens; (c) imaging under a telecentric lens.
Figure 2. Impact of the three Euler angles on plane imaging (the gray dashed line represents the imaging field of view).
Figure 3. Perspective distortion caused by intrinsic parameters.
Figure 4. Calibration of the actual homography matrix (the purple line represents the transformation relationship, and the blue dashed line represents the field of view of the actual camera).
Figure 5. Homography relationship between planes (the purple line represents the transformation relationship, the blue dashed line represents the field of view of the actual camera, and the red dashed line represents the field of view of the ideal camera).
Figure 6. Relationship between coordinate points (the purple line represents the transformation relationship).
Figure 7. Process of distortion correction.
Figure 8. The visual measurement platform: (1) PC; (2) industrial camera; (3) ring light; (4) machine vision platform; (5) test board; (6) adjustment block.
Figure 9. Pattern of the test board.
Figure 10. Comparison before and after correction.
Figure 11. The calculation results of $P_{err}$ and $MP_{err}$ for each pose.
Figure 12. The calculation results of β′, γ′, $\alpha_{err}$, and $D_{err}$ for each pose.
Figure 13. The distribution of measurement errors [17,22].
Figure 14. The RMSE of the three methods [17,22].
Figure 15. The SD of the three methods [17,22].
Figure 16. The sensitivity of the proposed method to camera calibration errors.
Figure 17. The planar tilt residuals caused by camera calibration errors.
Table 1. Physical parameters of the industrial camera.

Pixel resolution: 2592 × 1944
Pixel size: 2.2 μm × 2.2 μm
Size of imaging chip: 2592 × 2.2 μm = 5.702 mm; 1944 × 2.2 μm = 4.276 mm
Focal length: 12 mm
Focus distance: 520 mm
Image distance: 1/(1/12 − 1/520) = 12.283 mm
Visual field: 520 × 5.702/12.283 = 241.394 mm; 520 × 4.276/12.283 = 181.024 mm
Pixel precision: 520 × 2.2/12.283 = 93.137 μm/pixel
Table 2. Calibration results of the camera intrinsic parameters.

$(u_0, v_0)$: (1262.928, 960.322)
$f_x$: 5497.031
$f_y$: 5497.245
$s$: 0.041
$k_1$: −0.077
$k_2$: 0.305
$p_1$: 0.00047
$p_2$: −0.00011
Table 3. Extrinsic parameters of the original images.

Pose   α/°      β/°       γ/°       t3/mm
1      −0.198   −16.987   −0.316    521.447
2      −0.897   −8.157    −0.350    523.331
3      −0.579   13.439    −0.240    494.681
4      −0.113   29.971    −0.132    464.244
5      1.287    −1.827    −33.269   470.868
6      −0.238   −1.384    −16.685   495.230
7      −0.364   −1.990    6.645     522.048
8      −1.330   −1.699    30.047    512.587
9      4.653    7.897     −8.376    491.392
Table 4. Extrinsic parameters of the corrected images.

Pose   α′/°     β′/°      γ′/°      t3′/mm
1      −0.199   0.002     −0.011    521.453
2      −0.897   −0.012    −0.008    523.340
3      −0.579   0.001     0.005     494.695
4      −0.113   0.012     −0.005    464.258
5      1.287    −0.001    −0.004    470.858
6      −0.238   0.001     −0.011    495.230
7      −0.364   −0.010    −0.005    522.061
8      −1.330   −0.008    0.006     512.599
9      4.653    −0.018    −0.001    491.407
