Communication
Peer-Review Record

A New Monocular Pose Estimation Method for the Coplanar P4P Problem

Appl. Sci. 2023, 13(1), 183; https://doi.org/10.3390/app13010183
by Xudong Yu 1 and Yang Shang 2,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 11 November 2022 / Revised: 15 December 2022 / Accepted: 20 December 2022 / Published: 23 December 2022
(This article belongs to the Special Issue Advances in Applied Optics and Optical Signal Processing)

Round 1

Reviewer 1 Report

In places, the English could use some improvement. For instance, the repeatedly used phrase "a one and only analytic solution" is awkward; saying "a unique solution" would be better. Also, sometimes they refer to "control points" and sometimes to "characteristic points." This can confuse the reader. They should be consistent.

Section 2.1 is titled "Projection relationships of the characteristic points of a parallelogram's four vertices." The phrase "characteristic points of a parallelogram's four vertices" is meaningless. I suppose that they mean something like "Projection relationships of the characteristic points when these are a parallelogram's four vertices," or something like that. Section 2.2 is titled "Solution of the coplanar P4P problem that the four points are a parallelogram's four vertices." Saying "that" is bad English here. Replace "that" with "when." 

The authors need to be a little slower and more careful in setting up the problem in Section 2. Here they are working in the camera's reference frame, and so set R = I and T = 0, but in the experiments Section 3, they are clearly computing non-trivial rotations and translations. This is quite confusing.

In Section 2, they speak about comparing their method with "the iterative algorithm." Should that be "an iterative algorithm"? Anyway, they need to be clear about the details of the iterative algorithm. In Section 3, I get quite lost. They compute differences between the estimates from the iterative method and from their new method. What is needed though is the differences between the estimates from each of these methods, and the ACTUAL positions of the control points. Then the errors of each method could be compared to see which one works better. They claim to have computed the mean errors for their method in the last column of their table, but NOT the mean errors for the old method, so I see no basis for comparing the two methods. This is a very serious deficiency. 

 

Author Response

Many thanks to all the reviewers and editors. We have revised our paper and reply to the comments as follows.

 

  • Reviewers' comment 1: In places, the English could use some improvement. For instance, the repeatedly used phrase "a one and only analytic solution" is awkward; saying "a unique solution" would be better. Also, sometimes they refer to "control points" and sometimes to "characteristic points." This can confuse the reader. They should be consistent.

Reply 1: The English of the paper has been improved. In particular, the terminology has been made consistent.

  • Reviewers' comment 2: Section 2.1 is titled "Projection relationships of the characteristic points of a parallelogram's four vertices." The phrase "characteristic points of a parallelogram's four vertices" is meaningless. I suppose that they mean something like "Projection relationships of the characteristic points when these are a parallelogram's four vertices," or something like that. Section 2.2 is titled "Solution of the coplanar P4P problem that the four points are a parallelogram's four vertices." Saying "that" is bad English here. Replace "that" with "when."

Reply 2: The section titles above and other similar places have been revised.

  • Reviewers' comment 3: The authors need to be a little slower and more careful in setting up the problem in Section 2. Here they are working in the camera's reference frame, and so set R = I and T = 0, but in the experiments Section 3, they are clearly computing non-trivial rotations and translations. This is quite confusing.

Reply 3: To measure the pose parameters of a parallelogram object, the method proposed in this paper consists of two steps:

Step 1. Calculate the real coordinates of the parallelogram’s four vertices, taking the camera system as the reference system.

Step 2. Calculate the parallelogram’s pose parameters relative to the camera from the four vertices’ known coordinates in the camera system and in the object system, respectively.

The first step is the main content of our method. In this step the object system is not yet involved, and the camera system is taken as the reference system, so the camera’s translation vector is a zero vector and its rotation matrix is an identity matrix.

In the second step, the object system is defined. The translation vector and rotation matrix between the object system and the camera system are then used to describe the object’s pose. These are, of course, non-trivial.

In the revised paper, we have added the above description to Section 2 and restructured the section according to the two steps. We have also added a figure to explain the second step, and we use new symbols for the object points, the systems, the coordinates and the pose parameters to avoid possible confusion.
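
To make Step 2 concrete, here is a minimal sketch of how a rotation and translation could be recovered from the four vertices once their coordinates are known in both systems. It uses the standard SVD-based (Kabsch) least-squares alignment, which is an assumption for illustration and not necessarily the solver used in the paper. The function name `rigid_transform` and all numeric values are hypothetical.

```python
import numpy as np

def rigid_transform(obj_pts, cam_pts):
    """Least-squares R, t such that cam_pts[i] ~= R @ obj_pts[i] + t."""
    obj_pts = np.asarray(obj_pts, dtype=float)
    cam_pts = np.asarray(cam_pts, dtype=float)
    obj_c, cam_c = obj_pts.mean(axis=0), cam_pts.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (obj_pts - obj_c).T @ (cam_pts - cam_c)
    U, _, Vt = np.linalg.svd(H)
    # Force det(R) = +1; this also settles the sign ambiguity that
    # coplanar points leave in the third singular vector.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cam_c - R @ obj_c
    return R, t

# Made-up parallelogram in the object system (z = 0 plane).
obj = np.array([[0, 0, 0], [2, 0, 0], [3, 1, 0], [1, 1, 0]], dtype=float)

# Made-up ground-truth pose: rotate about z by 30 degrees, then about x by 20.
a, b = np.deg2rad(30.0), np.deg2rad(20.0)
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a),  np.cos(a), 0.0],
               [0.0, 0.0, 1.0]])
Rx = np.array([[1.0, 0.0, 0.0],
               [0.0, np.cos(b), -np.sin(b)],
               [0.0, np.sin(b),  np.cos(b)]])
R_true = Rx @ Rz
t_true = np.array([0.1, -0.2, 5.0])

# Vertices expressed in the camera system (what Step 1 would deliver).
cam = obj @ R_true.T + t_true

R_est, t_est = rigid_transform(obj, cam)
```

With noise-free input the estimated pose matches the made-up ground truth. Rotation angles could then be extracted from `R_est` in whatever angle convention the paper adopts, which this record does not specify.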

Reviewer 2 Report

The paper considers P4P problem and identifies a simplification of the solution if the four points form the vertices of a parallelogram. Exploiting the geometry of the parallelogram, the authors propose that only one analytical solution will exist if the distance between any two vertices is known. If the scaling information is unknown, then a scale factor will obviously remain between the real position and the results. The proposed conclusions are also supported by experimental results.

The results and conclusion of this paper are valid and the overall writing style is also good. The results seem to be useful to some practitioners. 

I recommend the paper for publication after minor corrections (see some hints below).

- There are perhaps typos on line 2 and 7 of the abstract.

- First 2 and a half lines in the Experimental results section seem to be some dummy text from a template. 

- I like the brief and direct style adopted by the authors in this paper, but I think experimental results could be explained a little more. I have been struggling to understand what is exactly "Mean Errors of the vertices’ coordinates" and how is it calculated? (there is only a brief comment in point 4) of the experimental results)

Author Response

Many thanks to all the reviewers and editors. We have revised our paper and reply to the comments as follows.

  • Reviewers' comment 4: In Section 2, they speak about comparing their method with "the iterative algorithm." Should that be "an iterative algorithm"? Anyway, they need to be clear about the details of the iterative algorithm. In Section 3, I get quite lost. They compute differences between the estimates from the iterative method and from their new method. What is needed though is the differences between the estimates from each of these methods, and the ACTUAL positions of the control points. Then the errors of each method could be compared to see which one works better. They claim to have computed the mean errors for their method in the last column of their table, but NOT the mean errors for the old method, so I see no basis for comparing the two methods. This is a very serious deficiency.

Reply 4: Our method (New P4P) solves for the parallelogram object’s pose parameters relative to the camera and for the coordinates of its four vertices when one of the side lengths is known and all of the vertex coordinates are unknown. The “iterative algorithm” (Old P4P) is actually the Bundle Adjustment (BA) method, an iterative correction and optimization algorithm that starts from initial values and is well known in photogrammetry and computer vision for its high accuracy. For the Old P4P, the parallelogram vertices’ coordinates are known, and only the parallelogram’s pose parameters are to be solved.

We have not compared the accuracy of the two methods, for the following reasons:

The old method takes the parallelogram vertices’ coordinates as known, and BA is a recognized high-precision algorithm. Our New P4P, by contrast, solves for the pose parameters from fewer known conditions (the vertices’ coordinates are unknown), so its accuracy is undoubtedly inferior to that of BA. The advantage of our new P4P method is that it achieves pose estimation under fewer known conditions than normal P4P methods require; pose estimation accuracy is not its advantage. Moreover, in the experiments, the true values of the object’s pose parameters relative to the camera are difficult to obtain. Therefore, we compare the pose estimation results of our new P4P method with those of the BA method, the recognized high-accuracy method, to show the correctness and effectiveness of the new method rather than any advantage in accuracy. As for measuring the parallelogram vertices’ coordinates, under the above conditions our new method can do so but traditional methods cannot. In the experiments, the shape and size of the parallelogram are controllable, so we compared the vertices’ coordinates measured by our new method with the true values to show the measurement errors (shown in the last column of Table 1; the old method cannot calculate the vertices’ coordinates).

In the revised paper, we have explained this and summarized the characteristics of our new P4P method relative to traditional P4P methods in the section on experiments.

  • Reviewers' comment 5: The paper considers P4P problem and identifies a simplification of the solution if the four points form the vertices of a parallelogram. Exploiting the geometry of the parallelogram, the authors propose that only one analytical solution will exist if the distance between any two vertices is known. If the scaling information is unknown, then a scale factor will obviously remain between the real position and the results. The proposed conclusions are also supported by experimental results.

Reply 5: Thank you for your approval.

  • Reviewers' comment 6: The results and conclusion of this paper are valid and the overall writing style is also good. The results seem to be useful to some practitioners. I recommend the paper for publication after minor corrections (see some hints below). - There are perhaps typos on line 2 and 7 of the abstract.

Reply 6: Thank you for your approval. We have found and corrected some text errors in the paper, and the English has been improved.

  • Reviewers' comment 7: - First 2 and a half lines in the Experimental results section seem to be some dummy text from a template. - I like the brief and direct style adopted by the authors in this paper, but I think experimental results could be explained a little more. I have been struggling to understand what is exactly "Mean Errors of the vertices’ coordinates" and how is it calculated? (there is only a brief comment in point 4) of the experimental results)

Reply 7: We apologize for forgetting to delete the dummy template text in the original version; it has been removed. We have explained the experimental results in more detail and summarized the characteristics of our new P4P method relative to traditional P4P methods in the section on experiments. We have also explained how the “Mean Errors of the vertices’ coordinates” are calculated: they are computed from the measurement results of the ith vertex’s coordinates in the object system and their true values.
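
As an illustration only, one plausible reading of such a mean error is the average Euclidean distance between each measured vertex and its true position. The exact definition is given in the paper, not in this record, so the formula, the function name `mean_vertex_error` and the sample values below are all assumptions.

```python
import numpy as np

def mean_vertex_error(measured, truth):
    """Average Euclidean distance between measured and true vertices.

    Assumed definition for illustration; the paper's own formula may differ.
    """
    measured = np.asarray(measured, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.mean(np.linalg.norm(measured - truth, axis=1)))

# Made-up true vertices in the object system and a measurement that is
# offset by (3, 4, 0) at every vertex, i.e. 5 units away from each one.
truth = np.array([[0, 0, 0], [200, 0, 0], [300, 100, 0], [100, 100, 0]], dtype=float)
measured = truth + np.array([3.0, 4.0, 0.0])

err = mean_vertex_error(measured, truth)
```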

Reviewer 3 Report

The submission lacks any conclusive, interpretation and focused thought. This is not sufficient for a good scientific paper. The authors need to clearly state what is relevance of this study, because this is not apparent. Why was this study undertaken?

Similarly, I suggest rewriting the abstract, focusing more on the ideas that the research is delivering instead of the raw results. More exactly, it would be useful to eliminate the unnecessary details.

The introduction is often too deep into the topic, detailed and repetitive but it does not lead in a clear and concise way to what you want to do in the study. Please review it. 

The article is very hard to read and is not comprehensible to a wider audience.  It requires extensive improvements, which are beyond the scope of a major revision.

Author Response

Many thanks to all the reviewers and editors. We have revised our paper and reply to the comments as follows.

  • Reviewers' comment 8: The submission lacks any conclusive, interpretation and focused thought. This is not sufficient for a good scientific paper. The authors need to clearly state what is relevance of this study, because this is not apparent. Why was this study undertaken? Similarly, I suggest rewriting the abstract, focusing more on the ideas that the research is delivering instead of the raw results. More exactly, it would be useful to eliminate the unnecessary details.

Reply 8: We have rewritten the abstract and the introduction to explain the application requirement, significance and ideas of our research. We have also eliminated some unnecessary details of the method. We would like to explain these points as follows.

  • The P4P problem is a classical problem of vision-based pose estimation. When the four points’ coordinates are known, or when the four points form a rectangle, the P4P problem can be solved by traditional algorithms.
  • For the pose and shape estimation of parallelogram objects with unknown shape parameters, such as parallelogram structures on buildings or the parallelogram formed by a walking person’s body standing at two positions (discussed at the end of the paper), the conditions required by traditional P4P algorithms are not met.
  • We present a new P4P method for parallelogram objects without known shape parameters, which can obtain the object’s pose parameters (translation vector and rotation angles relative to the camera) and shape parameters (the four vertices’ coordinates).
  • The basic idea and process of the method are as follows. First, we take the camera system as the reference system to simplify the expression of the projection relationships, and reduce the number of unknowns according to the relative coordinate constraints of the parallelogram’s four vertices. Next, we establish and solve a set of linear equations. Then, using one side length of the parallelogram, we obtain the coordinates of the four vertices and complete the measurement of the shape parameters. With the coordinates of the four vertices, we establish the object system on the parallelogram and calculate the coordinates of each vertex in the object system. Finally, with the four vertices’ coordinates in both the camera system and the object system, we solve the transformation parameters between the two systems to obtain the position and attitude of the object relative to the camera, completing the pose parameter measurement.
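
The first part of this process, recovering the vertices in the camera system, can be sketched as follows. This is one possible way to set up the linear equations and is not taken from the paper: it assumes normalized image coordinates (intrinsics already removed), vertices listed in order around the parallelogram so that P1 - P2 + P3 - P4 = 0, and a known length for the side P1P2. The function name `parallelogram_vertices` and all numeric values are hypothetical.

```python
import numpy as np

def parallelogram_vertices(img_pts, side_length):
    """Camera-frame vertices from 4 normalized image points (ordered around
    the parallelogram) and the known length of side P1P2.

    Each vertex is P_i = z_i * m_i with m_i = (x_i, y_i, 1). The constraint
    P1 - P2 + P3 - P4 = 0 gives 3 homogeneous linear equations in the 4
    depths, so the depths are fixed up to one common scale factor.
    """
    m = np.column_stack([np.asarray(img_pts, dtype=float), np.ones(4)])  # 4x3 rays
    A = np.column_stack([m[0], -m[1], m[2], -m[3]])                      # 3x4 system
    _, _, Vt = np.linalg.svd(A)
    z = Vt[-1]                      # null vector: depths up to scale and sign
    if z.sum() < 0:                 # keep the object in front of the camera
        z = -z
    P = z[:, None] * m              # vertices up to scale
    scale = side_length / np.linalg.norm(P[0] - P[1])
    return scale * P

# Made-up ground truth: P1 plus two edge vectors a and b.
P1 = np.array([-1.0, -0.5, 5.0])
a = np.array([2.0, 0.2, 0.5])
b = np.array([0.3, 1.5, -0.4])
P_true = np.array([P1, P1 + a, P1 + a + b, P1 + b])

# Ideal (noise-free) normalized image points.
img = P_true[:, :2] / P_true[:, 2:3]

P_est = parallelogram_vertices(img, side_length=np.linalg.norm(a))
```

Without the side length, the same computation still yields the vertices up to the common scale factor. The remaining steps (defining the object system and solving the transformation between the two systems) take these camera-frame vertices as input.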

We have supplemented these descriptions in the paper while eliminating some unnecessary details.

 

  • Reviewers' comment 9: The introduction is often too deep into the topic, detailed and repetitive but it does not lead in a clear and concise way to what you want to do in the study. Please review it. The article is very hard to read and is not comprehensible to a wider audience. It requires extensive improvements, which are beyond the scope of a major revision.

Reply 9: As mentioned above, we have rewritten the abstract and the introduction to explain the main ideas and methods of our research. We have added more explanation of the method's principle and of the experiments. We have also improved the wording and the English throughout to make the paper comprehensible to a wider audience.

Round 2

Reviewer 3 Report

The authors have revised the manuscript according to the comments.
