Finger Multi-Joint Trajectory Measurement and Kinematics Analysis Based on Machine Vision
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Reviewer Comments
1. Overall comments
The manuscript proposes a method for real-time, accurate finger motion capture using a high-speed camera and the MediaPipe app. The high-speed camera records the finger movement and feeds the frames into the MediaPipe app, which converts each frame image into key points and calculates their coordinates to form the trajectory. The proposed method was compared with the D-H method and the artificial keypoint alignment method, and the authors concluded that it is more accurate and efficient. This work provides an efficient and accurate finger motion capture method, which could be applied to rehabilitation and health care. The language and writing quality need to be improved, and some revisions are required before the manuscript can be published. Please see the specific comments below.
2. Specific comments
1) There are many typos and grammar mistakes in the writing. For example, line 51, line 71, line 97, line 137, line 146, line 199, line 207, line 218. The manuscript needs some editing for language and writing quality.
2) The authors claimed that this work focuses on finger rehabilitation but there is no literature related to finger/hand rehabilitation in the Introduction part. Please add 1-2 paragraphs to review the work on motion measurement of the finger/hand rehabilitation process and explain the motivation of this research.
3) In line 15, please use the full name of MCP, PIP and DIP for general readers.
4) In line 19, please add the potential application of the proposed method.
5) In line 34, why did the authors replace the collision detection of the entire hand with the collision between the object and motion trajectory? Please explain the advantages of this.
6) In line 71, “Furthermore, images captured by high-speed cameras no have post-correction circuits and are recorded in black and white, preserving the integrity of the original data”, what does “no have” mean? Please clarify and correct it.
7) Figure 3-8, 10-11, 13, 15, 17 are too low-resolution to be read. Please replace them with high-resolution figures.
8) In line 113, what does “satisfy the kinematic characteristics of the finger joints” mean? Please clarify it.
9) In lines 117-119, the descriptions of experiments that have already been done need to use the past tense.
10) In line 195, the subtitle of 4.2 is a sentence; please consider changing it.
11) In line 219, “takes no time” is not an accurate description. What data transmission method is used between the high-speed camera and the app and how long does it take to transmit one frame image? Please add relevant information.
12) The conclusions part needs more details about the limitations of the proposed method and possible improvements for future work. For example, the use of high-speed cameras is not cost-effective and not practical to be used for real-world applications. A possible solution could be training a deep-learning model to learn from images taken by a high-speed camera, which can be used to interpolate the finger trajectory from a normal camera in real-world applications.
Comments for author File: Comments.pdf
Comments on the Quality of English Language
There are many typos and grammar mistakes in the writing. For example, line 51, line 71, line 97, line 137, line 146, line 199, line 207, line 218. The manuscript needs some editing for language and writing quality.
Author Response
Response to Reviewers:
Thanks for your comments on our manuscript. We have revised our manuscript according to your comments:
- Reviewer #3, specific comment (There are many typos and grammar mistakes in the writing. For example, line 51, line 71, line 97, line 137, line 146, line 199, line 207, line 218. The manuscript needs some editing for language and writing quality.):
Answer to Reviewer 3:
Thank you for your advice.
We accept this comment and take the wording errors in the text seriously. We have made the following corrections to the article.
The original line 51 has been revised; see Section 1 of the article.
The original line 97 has been revised; see Section 2 of the article.
The original line 137 has been revised; see Section 3.2 of the article.
The original line 146 has been revised; see Section 3.3 of the article.
The original line 199 has been revised; see Section 4.1 of the article.
The original line 207 has been revised; see Section 4.2 of the article.
The original line 218 has been revised; see Section 4.2 of the article.
- Reviewer #3, specific comment (The authors claimed that this work focuses on finger rehabilitation but there is no literature related to finger/hand rehabilitation in the Introduction part. Please add 1-2 paragraphs to review the work on motion measurement of the finger/hand rehabilitation process and explain the motivation of this research.):
Answer to Reviewer 3:
Thank you for your advice.
Your comment has been accepted. We have added a description of the rehabilitation process and the working principle of the finger rehabilitation robot, explained the motivation of this study, and added citations to the relevant papers. You can see these changes in Section 1 of the article.
- Reviewer #3, specific comment (In line 15, please use the full name of MCP, PIP and DIP for general readers.):
Answer to Reviewer 3:
Thank you for your advice.
Your comment has been accepted. Because the abstract was revised, that passage was deleted; your comment is instead reflected at line 100 of the article.
- Reviewer #3, specific comment (In line 19, please add the potential application of the proposed method.):
Answer to Reviewer 3:
Thank you for your advice.
We have addressed this point at line 21 of the article. Please refer to that line for details.
- Reviewer #3, specific comment (In line 34, why did the authors replace the collision detection of the entire hand with the collision between the object and motion trajectory? Please explain the advantages of this.):
Answer to Reviewer 3:
Thank you for your advice.
We replaced free, random hand gestures with a grip on a fixed object, which greatly reduces the number of parameters that influence the experiment. The experimental results show that this strategy significantly reduces interference from non-target parameters during the experiment, and that experimental efficiency and data accuracy are effectively improved. Interference from external factors is eliminated, which supports the reliability and repeatability of the experimental results.
We apologize for any difficulty this caused in reading. See Section 1 of the article for the specific changes.
- Reviewer #3, specific comment (In line 71, “Furthermore, images captured by high-speed cameras no have post-correction circuits and are recorded in black and white, preserving the integrity of the original data”, what does “no have” mean? Please clarify and correct it.):
Answer to Reviewer 3:
Thank you for your advice.
We are very sorry for the difficulty our improper wording caused. We have corrected the original text. See Section 1 of the article for the specific changes.
- Reviewer #3, specific comment (Figure 3-8, 10-11, 13, 15, 17 are too low-resolution to be read. Please replace them with high-resolution figures.):
Answer to Reviewer 3:
Thank you for your advice.
We are very sorry for the trouble caused by the poor clarity of the figures. We have replaced all of the images in question; please check the corresponding figures in the article. In particular, we combined Figures 7 and 8 into a single figure, which we hope makes them easier to read.
- Reviewer #3, specific comment (In line 113, what does “satisfy the kinematic characteristics of the finger joints” mean? Please clarify it.):
Answer to Reviewer 3:
Thank you for your advice.
We apologize for the confusion caused by our wording. What we were trying to say was, “Whether the motion trajectory of the finger rehabilitation robot can match the motion trajectory of the finger joints of normal people is the key index to measure the effectiveness of the finger rehabilitation robot”.
This has been amended in Section 3.2 of the article.
- Reviewer #3, specific comment (In lines 117-119, describing the experiments that have been done need to use past tense.):
Answer to Reviewer 3:
Thank you for your advice.
Thank you very much for the reminder; this was entirely due to our negligence. We have changed the tense of the sentences to the past perfect, which you can see in Section 3.2 of the article.
- Reviewer #3, specific comment (In line 195, the subtitle of 4.2 is a sentence, and consider to change it.):
Answer to Reviewer 3:
Thank you for your advice.
Thank you very much for the reminder. We have changed the title to “Comparative analysis of three methods”, which you can see in Section 4.2 of the article.
- Reviewer #3, specific comment (In line 219, “takes no time” is not an accurate description. What data transmission method is used between the high-speed camera and the app and how long does it take to transmit one frame image? Please add relevant information.):
Answer to Reviewer 3:
Thank you for your advice.
This is a very good question, and we attach great importance to it.
The storage mode of the high-speed camera can be set to store pictures directly. Following the experimental workflow shown in Figure 3, the pictures are imported directly into the program, which automatically identifies each node and obtains its position coordinates.
Regarding the time lost during transmission: running, storage, and transfer are carried out automatically by a simple program. The process is nearly synchronous, so the transmission time could not be measured separately.
The phrase "takes no time" in the article was a very unprofessional description. We have now quantified the time the system takes: 5 seconds per 500 images. The description in the article has been corrected accordingly. For more information, see Section 4.2 and Table 5 of the article.
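To make the quoted figure concrete, the short sketch below converts "5 seconds per 500 images" into an average per-image time and an equivalent throughput. Only the two numbers 5 and 500 come from the response; the camera frame rate used for the comparison is a hypothetical value chosen for illustration and is not taken from the manuscript.

```python
def per_image_time_ms(total_seconds: float, n_images: int) -> float:
    """Average processing time per image, in milliseconds."""
    return total_seconds * 1000.0 / n_images


def throughput_fps(total_seconds: float, n_images: int) -> float:
    """Average number of images processed per second."""
    return n_images / total_seconds


# Figures quoted in the response: 5 seconds for 500 images.
latency_ms = per_image_time_ms(5.0, 500)
fps = throughput_fps(5.0, 500)

# Hypothetical camera frame rate (an assumption, not from the manuscript).
ASSUMED_CAMERA_FPS = 500.0
keeps_up_with_camera = fps >= ASSUMED_CAMERA_FPS

print(f"{latency_ms:.1f} ms per image, {fps:.0f} images per second")
print("processing keeps pace with the assumed camera rate:", keeps_up_with_camera)
```

Under that assumed frame rate the processing would run offline on stored images rather than in step with capture, which is consistent with the stored-picture workflow described in the response.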
- Reviewer #3, specific comment (The conclusions part needs more details about the limitations of the proposed method and possible improvements for future work. For example, the use of high-speed cameras is not cost-effective and not practical to be used for real-world applications. A possible solution could be training a deep-learning model to learn from images taken by a high-speed camera, which can be used to interpolate the finger trajectory from a normal camera in real-world applications.):
Answer to Reviewer 3:
Thank you for your advice.
We agree with your comment and have revised the article. In our opinion, the shortcoming of this experiment is that the cost threshold for practical application is too high. The specific changes are in Section 6 of the article.
Again, the authors thank you for your suggestions.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
The proposed manuscript presents a study regarding the measurement of finger trajectories using machine vision. Although the subject offers significant potential, several important considerations must be addressed:
a) Minor comments:
1a. Please enhance the resolution and clarity of all images within the figures, especially in the case of Figure 3, where it is quite difficult to read the table of values.
2a. Is movement analysis possible for other fingers as well? If so, how can it be achieved?
3a. What is the relevance of Figures 7 and 8 to the analysis proposed by the authors?
4a. Probably the authors should reformulate the sentence between lines 176 and 177. The manuscript would benefit from more precise wording.
b) Major comments:
The authors state: "the accuracy of the trajectory measurement based on MediaPipe method is much higher than that of D-H method". The comparison is questionable for the following reasons:
1b. I'm curious how the authors measured the trajectories using the DH method.
2b. A single camera is used for motion analysis. Achieving an optimal camera angle to exclusively record finger flexion-extension is problematic. Therefore, the precision of trajectory calculations can be compromised.
3b. The coordinates for the DH model presented by the authors in figure 10 also include the rotation of the finger around the vertical axis. It is about the abduction movement in medical terms. Did the authors incorporate abduction movement into the DH model?
4b. Sometimes slight abduction occurs even when a finger is primarily performing flexion and extension, due to uneven strength in the muscles or habitual movement patterns. The experimental setup did not allow for the capture of this movement.
Consequently, in this case, the DH model is unsuitable as a reference point for comparison. To enhance the study's credibility, a comparison with sensor-derived values is recommended. Alternatively, the authors should contextualize their results by comparing them to previous studies.
Author Response
Response to Reviewers:
Thanks for your comments on our manuscript. We have revised our manuscript according to your comments:
- Reviewer #1, major comment (I'm curious how authors measured the trajectories using the DH method):
Answer to Reviewer 1:
Thank you for your advice.
We have added Table 1 on finger data in Section 1. In Section 4.1, Tables 1 and 3 on the D-H method parameters have been added. We have also written out the derivation formulas (1) to (5), from which formulas (6) to (8) for the displacement coordinates are obtained.
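The D-H parameter tables and formulas (1) to (8) are not reproduced in this record. As a rough illustration of how displacement coordinates follow from D-H parameters, the sketch below chains standard D-H transforms for a planar three-joint finger (MCP, PIP, DIP flexion only, with no abduction). The planar simplification (alpha = 0, d = 0) and the link lengths are assumptions made for this example, not values from the manuscript.

```python
import math


def dh_transform(theta, a, alpha=0.0, d=0.0):
    """Standard D-H homogeneous transform as a 4x4 nested list."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(A, B):
    """Product of two 4x4 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def fingertip_position(joint_angles, link_lengths):
    """Fingertip (x, y, z) in the MCP base frame for given flexion angles (rad)."""
    T = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for theta, a in zip(joint_angles, link_lengths):
        T = matmul(T, dh_transform(theta, a))
    return T[0][3], T[1][3], T[2][3]


# Hypothetical phalanx lengths in millimetres (proximal, middle, distal).
LINK_LENGTHS_MM = [45.0, 25.0, 20.0]

# Example: MCP, PIP and DIP flexed by 0.3, 0.5 and 0.2 rad respectively.
tip = fingertip_position([0.3, 0.5, 0.2], LINK_LENGTHS_MM)
print("fingertip position (mm):", tip)
```

For this planar case the result reduces to the familiar closed form x = l1 cos(t1) + l2 cos(t1 + t2) + l3 cos(t1 + t2 + t3), with the corresponding sine terms for y.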
- 2. Reviewer #1, major comment (A single camera is used for motion analysis. Achieving an optimal camera angle to exclusively record finger flexion-extension is problematic. Therefore, the precision of trajectory calculations can be compromised.):
Answer to Reviewer 1:
Thank you for your advice.
Your question about the accuracy limitations of a single camera is both pertinent and significant, and we considered it at the outset of the experiment. Existing studies (Reference 22) also use a single camera to measure hand trajectories, demonstrating that the approach is effective and feasible. In addition, to mitigate potential sources of interference such as viewing angle and lighting, we took measures to improve the accuracy of the experiment: the experimental platform shown in Figures 4 and 5 was designed specifically for precise measurement of finger motion trajectories. This information has been incorporated into Section 3.2, and the relevant reference has been added as Reference 22.
- 3. Reviewer #1, major comment (The coordinates for the DH model presented by the authors in figure 10 also include the rotation of the finger around the vertical axis. It is about the abduction movement in medical terms. Did the authors incorporate abduction movement into the DH model?):
Answer to Reviewer 1:
Thank you for your advice.
The error in Figure 10 was due to an oversight on our part, and we have corrected it. Since the gestures tested in this paper do not involve finger abduction, the DH model in this paper does not include finger abduction movements. We deeply apologize for any confusion caused by our mistake.
You can see this in Figure 9 in Section 4.1 of the article.
- 4. Reviewer #1, major comment (Sometimes slight abduction occur even when a finger is primarily performing flexion and extension, due to uneven strength in the muscles or habitual movement patterns. The experimental setup did not allow for the capture of this movement.):
Answer to Reviewer 1:
Thank you for your advice.
Your observation is correct. Slight abduction does sometimes occur even when a finger is primarily performing flexion and extension, due to uneven muscle strength or habitual movement patterns, and the present experimental setup cannot capture this abduction. However, it could be captured simply by positioning the camera perpendicular to the finger. Because abduction and adduction of the fingers have relatively little functional impact on daily activities, this experimental design does not include such movements in the scope of capture and analysis. The above has been added to Section 3.3, and the relevant reference has been added as Reference 23.
- 5. Reviewer #1, major comment (Consequently, in this case, the DH model is unsuitable as a reference point for comparison. To enhance the study's credibility, a comparison with sensor-derived values is recommended. Or the authors should contextualize their results by comparing them to previous studies.):
Answer to Reviewer 1:
Thank you for your advice.
In this study, the D-H (Denavit-Hartenberg) method was selected as the comparative framework because it is a basic and widely recognized analytical method in the field, and its design principles are closely aligned with the physiological characteristics of the finger.
The question you raised is very constructive, and comparing against sensor-derived values is also a very good approach. We conducted a series of comparative experiments against the content of Reference 1, which both enriched the dimensions of the analysis and made the results more persuasive.
We drew on the research strategy in Reference 23, using the DIP and PIP angles to construct a comparison of key parameters. It should be noted that, because the selected gestures differ in their degree of curvature, the values of the two differ considerably, but the two curves show a highly consistent trend: a gentle transition at the beginning, then a sharp rise, and a gradual flattening again at the end of the movement. This finding validates the effectiveness of the experimental method in capturing finger motion dynamics and highlights its stability in the analysis of complex motion patterns.
In addition, the mathematical relationships between MCP and PIP, and between PIP and DIP, under the gripping motion are introduced. In Equation (1), x is the MCP bending angle and y is the PIP bending angle. In Equation (2), u is the PIP bending angle and v is the DIP bending angle. The average accuracies are 87.6% and 88.3%, respectively. This result confirms the reliability of the experimental method and improves the scientific rigor of the study and the credibility of its conclusions.
The above has been added to Section 4.2.
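(1) [equation relating the PIP bending angle y to the MCP bending angle x; the formula is not reproduced in this record]
(2) [equation relating the DIP bending angle v to the PIP bending angle u; the formula is not reproduced in this record]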
- 6. Reviewer #1, minor comment (Please enhance the resolution and clarity of all images within the figures. Especially in the case of figure 3, where it is quite difficult to read that table of values.):
Answer to Reviewer 1:
Thank you for your advice.
We apologize for the unclear picture. We have replaced Figure 3.
For details, see Section 3.1.
- 7. Reviewer #1, minor comment (Is movement analysis possible for other fingers as well? If so, how can it be achieved?):
Answer to Reviewer 1:
Thank you for your advice.
Your question about the test scope of this experimental design is very insightful and bears directly on the versatility and flexibility of the experimental apparatus. The purpose of this paper is to describe an experimental test device that can analyze the movement of any finger. For example, to analyze the thumb instead of the index finger, we take the following steps.
First, a hand posture that suits the characteristics of thumb movement is selected as the basic condition of the experiment. Second, the position of the elbow is adjusted so that the thumb remains perpendicular to the capture angle of the camera system throughout the movement, which is crucial for capturing accurate, distortion-free motion data. Next, the hand fixation devices and lighting devices are carefully adjusted to eliminate potential sources of interference and ensure a standardized experimental environment. Finally, the thumb movement data are analyzed in depth using the same data processing and analysis method as in the earlier analysis of the index finger. This process extends the scope of application and the scientific value of the paper.
The above has been added to Section 3.2.
- 8. Reviewer #1, minor comment (What is the relevance of Figures 7 and 8 to the analysis proposed by the authors?):
Answer to Reviewer 1:
Thank you for your advice.
The omission in the presentation of Figures 7 and 8 was indeed our mistake, and it may have affected readers' understanding to some extent. To remedy this and improve the clarity and coherence of the discussion, we have redrawn these two images and merged them into one figure, labeled (a) and (b) respectively.
Panel (a), originally Figure 8, shows the numbering of the hand key points in the MediaPipe system together with the corresponding hand joint structure diagram. It is intended to explain clearly how the MediaPipe framework defines and identifies the key parts of the hand. Panel (b), originally Figure 7, presents an example of the match between the actual image and the MediaPipe recognition result during system analysis.
This figure explains the workflow of the experimental system in detail, illustrates the accuracy of the experimental process and results through an example, and further emphasizes the effectiveness and reliability of the overall experimental method.
The above has been added to Section 3.3.
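As an illustration of how the key-point numbering can be turned into joint angles, the sketch below uses the MediaPipe Hands landmark indices for the index finger (wrist 0, MCP 5, PIP 6, DIP 7, tip 8) and computes a flexion angle at each joint from 2D coordinates. It does not call MediaPipe itself. The example coordinates are made up, and using the wrist-to-MCP segment as the reference for the MCP angle is an assumption of this sketch rather than the authors' stated procedure.

```python
import math

# MediaPipe Hands landmark indices for the wrist and the index finger.
WRIST, INDEX_MCP, INDEX_PIP, INDEX_DIP, INDEX_TIP = 0, 5, 6, 7, 8


def flexion_angle_deg(proximal, joint, distal):
    """Angle between the incoming and outgoing segments at a joint.

    Returns 0 for a straight finger and grows as the joint bends.
    """
    v1 = (joint[0] - proximal[0], joint[1] - proximal[1])
    v2 = (distal[0] - joint[0], distal[1] - joint[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return math.degrees(math.atan2(abs(cross), dot))


def index_finger_angles(landmarks):
    """MCP, PIP and DIP flexion angles from a {landmark index: (x, y)} mapping."""
    p = landmarks
    return {
        "MCP": flexion_angle_deg(p[WRIST], p[INDEX_MCP], p[INDEX_PIP]),
        "PIP": flexion_angle_deg(p[INDEX_MCP], p[INDEX_PIP], p[INDEX_DIP]),
        "DIP": flexion_angle_deg(p[INDEX_PIP], p[INDEX_DIP], p[INDEX_TIP]),
    }


# Hypothetical key points for one frame (arbitrary units, not measured data).
frame = {
    WRIST: (0.0, 0.0),
    INDEX_MCP: (1.0, 0.0),
    INDEX_PIP: (2.0, 0.0),
    INDEX_DIP: (2.0, 1.0),
    INDEX_TIP: (1.0, 1.0),
}
angles = index_finger_angles(frame)
print(angles)
```

Applying `index_finger_angles` to every frame of a recording would give one angle series per joint, which is the kind of data the joint-angle comparisons in the responses rely on.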
- 9. Reviewer #1, minor comment (Probably the authors should reformulate the sentence between lines 176 and 177. The manuscript would benefit from more precise wording.):
Answer to Reviewer 1:
Thank you for your advice.
We are very sorry for the imprecise wording at lines 176 and 177. We have corrected it; the revised sentence is at lines 178 and 179, in Section 3.3 of the article. Again, the authors thank you for your suggestions.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The paper is devoted to the measurement of finger trajectories using a high-speed camera. The article compares three image processing methods for studying finger trajectories: the keypoint alignment method, the Denavit-Hartenberg (DH) method, and the MediaPipe method. The paper is written in a very detailed and understandable manner; however, I have some comments for the authors. Please find them below.
1) Figures 3-8, 10-11, 13, 15 are of too poor quality. Please fix them.
2) Figure 7-8. Just combine two figures to figure 7 and add (a) and (b) labels.
3) Figure 9. What method of fitting the experimental results was used? In Figure 9(c), the trajectory is not fitted quite accurately. What is the average MSE of the fitted trajectories?
4) There is no description of the DH methods, the artificial keypoint alignment method, and the MediaPipe method in formulaic form.
5) Subsection 4.2: it is necessary to add numerical results and summarize them in a table for clarity.
6) Add numerical results of comparison of methods to Conclusions.
I believe that the article can be published after minor revision.
The research is good, but the article formatting, all the pictures and methods need to be corrected.
Author Response
- Reviewer #2 (Figures 3-8,10-11,13,15 are too poor quality. Please fix it.):
Answer to Reviewer 2:
Thank you for your advice.
We apologize for the unclear pictures. We have replaced all the images you mentioned, and you can view them in the article.
- Reviewer #2 (Figure 7-8. Just combine two figures to figure 7 and add (a) and (b) labels.):
Thank you for your advice.
We apologize for the way these figures were presented. We have corrected the problem with Figures 7 and 8; you can see the result in Figure 7 of the article.
The above has been added to Section 3.3.
- Reviewer #2 (Figure 9. What method of fitting the experimental results was used? In Figure 9(c), the trajectory is not fitted quite accurately. What is the average MSE of the fitted trajectories?):
Answer to Reviewer 2:
Thank you for your advice.
The nonlinear least squares method was used for all three fits. The specific fitting types, root mean square errors, and related values are listed in the supplementary table in Section 3.3.
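The fitting types and solver are not given in this record, so the following is only a minimal sketch of what a nonlinear least squares fit with an RMSE report can look like. It assumes a hypothetical exponential model y = a * exp(b * x), uses Gauss-Newton iterations with step halving as one common solver, and runs on synthetic noise-free data. None of these choices are taken from the manuscript.

```python
import math


def model(x, a, b):
    """Hypothetical fitting model (an assumption): y = a * exp(b * x)."""
    return a * math.exp(b * x)


def sse(xs, ys, a, b):
    """Sum of squared residuals for parameters (a, b)."""
    return sum((y - model(x, a, b)) ** 2 for x, y in zip(xs, ys))


def fit_gauss_newton(xs, ys, a, b, max_iter=50, tol=1e-12):
    """Fit (a, b) by Gauss-Newton iterations with step halving."""
    for _ in range(max_iter):
        # Build the 2x2 normal equations (J^T J) delta = J^T r.
        s_aa = s_ab = s_bb = g_a = g_b = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = y - a * e                 # residual
            ja, jb = e, a * x * e         # partial derivatives of the model
            s_aa += ja * ja
            s_ab += ja * jb
            s_bb += jb * jb
            g_a += ja * r
            g_b += jb * r
        det = s_aa * s_bb - s_ab * s_ab
        if det == 0.0:
            break
        da = (s_bb * g_a - s_ab * g_b) / det
        db = (s_aa * g_b - s_ab * g_a) / det
        # Halve the step until the squared error actually decreases.
        current = sse(xs, ys, a, b)
        step = 1.0
        while step > 1e-10 and sse(xs, ys, a + step * da, b + step * db) >= current:
            step *= 0.5
        if step <= 1e-10:
            break                         # no further improvement possible
        a += step * da
        b += step * db
        if abs(step * da) + abs(step * db) < tol:
            break
    return a, b


def rmse(xs, ys, a, b):
    """Root mean square error of the fitted curve."""
    return math.sqrt(sse(xs, ys, a, b) / len(xs))


# Synthetic, noise-free data generated from a = 2.0, b = 0.5 (not measured data).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [model(x, 2.0, 0.5) for x in xs]

a_fit, b_fit = fit_gauss_newton(xs, ys, a=1.5, b=0.4)
fit_rmse = rmse(xs, ys, a_fit, b_fit)
print(a_fit, b_fit, fit_rmse)
```

RMSE is the square root of the MSE the reviewer asked about, so either value can be recovered from the other when comparing fitted trajectories.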
- Reviewer #2 (There is no description of the DH methods, the artificial keypoint alignment method, and the MediaPipe method in formulaic form.):
Answer to Reviewer 2:
Thank you for your advice.
We welcome and highly value the suggestion to strengthen the formulaic description of the content, which has significantly enriched the theoretical depth and systematic structure of the paper. Specifically, for the formulaic expression of the D-H method, readers can refer to formulas (1) to (8) in Section 4.1 of the article. In addition, for further elaboration of the MediaPipe method, we have added a supplementary explanation in Table 3 of Section 3.3.
However, it is worth noting that, although we have tried to improve the formulaic level of the paper, the artificial keypoint alignment method cannot be accurately described by mathematical formulas because it is experiment-oriented in nature. Therefore, in this paper we focus on demonstrating and analyzing the experimental results to illustrate the characteristics and practical performance of this method, in order to compensate for its limitations in formulaic expression.
The above has been added to Sections 4.1, 3.3, and 2.
- Reviewer #2 (subsection 4.2. it is necessary to add numerical results and summarize them in a table for clarity):
Answer to Reviewer 2:
Thank you for your advice.
We welcome and attach great importance to the feedback on the lack of clarity in Section 4.2. To make this section clearer and easier to understand, we have added Table 5 to Section 4.2. The table is intended to strengthen the summary of the section's core content, improve the logical flow, and make the information in the section easier for readers to grasp.
- 6. Reviewer #2 (Add numerical results of comparison of methods to Conclusions.):
Answer to Reviewer 2:
Thank you for your advice.
We gladly adopted and highly value the feedback regarding the inclusion of numerical results in Section 4.2, and we have taken the following measures. First, part of the numerical results is now presented in tabular form, which makes the data more readable and lets readers compare the performance of the different methods directly. Second, we rewrote the description of the table content in detail, aiming to strengthen the data support and technical depth of the text through more accurate wording.
These improvements enrich the data supporting the research results and make the paper more rigorous and persuasive, providing a more solid and reliable reference for research in related fields.
Again, the authors thank you for your suggestions.
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have successfully addressed all of my feedback, resulting in a significantly improved manuscript.