SCGFormer: Semantic Chebyshev Graph Convolution Transformer for 3D Human Pose Estimation
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
In the Abstract, it is important to include a brief presentation (two sentences) of the theoretical underpinning and context of the research, so I suggest revising this section and adding a couple of sentences about the results. In the Introduction, the current first paragraph could come later, preceded by a more general paragraph that introduces the important concepts of the paper. The Introduction should also cite more existing research to justify the importance and characteristics of SCGFormer, and should include a reference for the term, which is used for the first time in the paper. The research questions should also be stated in the Introduction. It would be a good idea to add a Literature section covering related work, so that the presented model is justified and grounded in related, older models. A Methodology section should also be included, presenting the model architecture clearly, followed by its characteristics and the initial experiments. A Discussion section could be added so that the authors can comment on their model architecture in relation to the previous literature.
Comments on the Quality of English Language
The academic genre is well presented and the language use is fine.
Author Response
Thank you for taking the time to review the manuscript and for providing valuable suggestions. Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
The paper discusses the importance of using 2D human joint information to predict 3D human skeletons accurately. The SCGFormer model is proposed to improve the accuracy of predicting human skeletal information in 3D space. It consists of two interconnected modules: SGraAttention and AcChebGconv. SGraAttention extracts global feature information from each 2D human joint to enhance local feature learning, while AcChebGconv broadens the receptive field for graph structure information and aggregates valuable neighboring feature information. Experimental results on benchmark datasets such as Human3.6M and MPI-INF-3DHP show the good performance of SCGFormer. I read this paper with great interest; it is well written and well organized. I have a few comments that I hope can improve the paper.
1. The paper lacks a comprehensive discussion of the limitations and challenges associated with leveraging 2D joint data. While it is suggested that effectively leveraging 2D joint data can improve accuracy, the specific limitations and challenges that arise when working with such data are not addressed. This omission limits the reader's understanding of the potential drawbacks and caveats of the proposed method.
2. Although the network architecture of SCGFormer incorporates a Transformer and two distinct types of graph convolution, the paper does not thoroughly explain the rationale behind this choice. It is important to provide a detailed justification for the selection of these specific components and to explain how they address the challenges in predicting 3D human skeletal information. Without such explanations, it is difficult for readers to evaluate the suitability and effectiveness of the proposed architecture.
3. While the paper briefly describes SGraAttention and AcChebGconv, can you provide a more comprehensive evaluation of the individual contributions and limitations of each module? This would help readers understand how these modules improve the accuracy of predicting human skeletal information and whether there are any limitations or trade-offs associated with their use.
4. What is the proposed model in the paper for enhancing the accuracy of predicting human skeletal information in 3D space?
5. How is the network architecture of SCGFormer structured, and what are the two interconnected modules?
6. How does SGraAttention contribute to the model's ability to predict 3D human skeletons? What is its role in feature extraction?
7. How does AcChebGconv enhance the model's predictions by leveraging graph structure information? What is its impact on the receptive field?
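As background for comments 2 and 7, the following is a minimal sketch of a generic Chebyshev graph convolution and of how its receptive field grows with the polynomial order. It is not the authors' AcChebGconv: the 5-joint chain, the assumption that the largest Laplacian eigenvalue is 2, and every function name are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a generic Chebyshev graph convolution on a toy
# 5-joint chain. The graph, names and weights are assumptions, not SCGFormer.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def scaled_laplacian(adj):
    """Return L~ = L - I for L = I - D^-1/2 A D^-1/2, assuming lambda_max = 2."""
    n = len(adj)
    inv_sqrt_deg = [sum(row) ** -0.5 if sum(row) > 0 else 0.0 for row in adj]
    return [[-inv_sqrt_deg[i] * adj[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]

def chebyshev_basis(lap, num_terms):
    """Return [T_0, ..., T_{num_terms-1}] with T_k = 2 L~ T_{k-1} - T_{k-2}."""
    n = len(lap)
    basis = [[[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]]
    if num_terms > 1:
        basis.append([row[:] for row in lap])
    for _ in range(2, num_terms):
        prod = matmul(lap, basis[-1])
        basis.append([[2.0 * prod[i][j] - basis[-2][i][j] for j in range(n)]
                      for i in range(n)])
    return basis[:num_terms]

def cheb_conv(x, basis, weights):
    """Compute sum_k T_k X W_k for joint features x (n x c_in)."""
    n, c_out = len(x), len(weights[0][0])
    out = [[0.0] * c_out for _ in range(n)]
    for t_k, w_k in zip(basis, weights):
        term = matmul(matmul(t_k, x), w_k)
        out = [[out[i][j] + term[i][j] for j in range(c_out)] for i in range(n)]
    return out

def receptive_field(basis, joint, tol=1e-9):
    """Joints whose features can reach `joint` through any basis term."""
    return {j for t_k in basis for j, v in enumerate(t_k[joint]) if abs(v) > tol}

# Toy chain 0-1-2-3-4 standing in for one limb of a skeleton graph.
chain = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]
lap = scaled_laplacian(chain)
# Receptive field of joint 0 when 1, 2 or 3 polynomial terms are used.
fields = [receptive_field(chebyshev_basis(lap, k), 0) for k in (1, 2, 3)]
```

In this toy setting, each additional polynomial term lets joint 0 aggregate features from one more hop along the chain, which is the sense in which a higher-order Chebyshev convolution broadens the receptive field. How the manuscript's module differs from this generic form is a question for the authors.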
Author Response
Thank you for taking the time to review the manuscript and for providing valuable suggestions. Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The research presented in the article is important for the development of many areas amenable to automation, where AI algorithms can complement or support expert opinion through accuracy of measurement or speed of analysis. Novelty and contribution should be highlighted in the Introduction section.
Limitations of the approach and directions for further research should be described in detail in the Discussion section, also considering applications, e.g., in clinical screening of posture or gait (in geriatrics, neurology, orthopaedics or rehabilitation). Are different versions of the proposed solution possible (e.g., a simple, low-cost version with lower accuracy and a professional one)? Are computational analyses based on, e.g., fuzzy logic or fractal analysis the way forward?
Author Response
Thank you for taking the time to review the manuscript and for providing valuable suggestions. Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 4 Report
Comments and Suggestions for Authors
The authors should:
(1) Use pseudocode for the description of the method;
(2) Use a statistical test for the comparison of the examined methods;
(3) Present some information about the time efficiency of the methods;
(4) Present the limitations of their study and propose how these could be handled in future work.
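To illustrate the kind of comparison meant in point (2), the following is a minimal sketch of one possible paired test, an exact sign-flip permutation test on per-sample error differences. The choice of test and the function name `paired_sign_flip_test` are assumptions made for illustration; other paired tests would serve equally well, and nothing here is taken from the manuscript.

```python
from itertools import product

def paired_sign_flip_test(errors_a, errors_b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Returns (mean_difference, p_value). All 2**n sign patterns are
    enumerated, so this is only practical for small n; a larger n would
    need random sampling of sign patterns instead.
    """
    if len(errors_a) != len(errors_b) or not errors_a:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    observed = abs(sum(diffs))
    extreme = 0
    # Under the null hypothesis the sign of each paired difference is arbitrary.
    for signs in product((1, -1), repeat=n):
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:
            extreme += 1
    return sum(diffs) / n, extreme / 2 ** n
```

The function would be called with the errors of two methods measured on the same samples (for example, one error per action or per sequence); a small p-value suggests that the observed mean difference is unlikely to arise from chance alone.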
Comments on the Quality of English Language
Minor editing of English language required.
Author Response
Thank you for taking the time to review the manuscript and for providing valuable suggestions. Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The Abstract presents important information but needs revision: its content should be synthesized more effectively, avoiding the repetition of phrases and words. The Introduction could provide more information about the body of knowledge (important concepts) behind this work. The meaning of lines 12-16 is not clear and also needs revision. The research design and the selected methods could be presented better, so that the average reader can follow them. There is no literature reference for the CNN model. Section 2.2 could be extended with important details related to this work and connected to the literature review and previous research. Section 4.4 needs elaboration.
Comments on the Quality of English Language
Language use is fine and the academic genre is well attended to.
Author Response
Thank you for taking the time to review the manuscript and provide valuable suggestions. Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
No further comments. Thanks!
Author Response
Thank you for taking the time to review the manuscript and provide valuable suggestions.
Reviewer 4 Report
Comments and Suggestions for Authors
In the Conclusion, the authors should mention the limitations of their study and how these could be handled in future work.
Comments on the Quality of English Language
Minor editing of English language required.
Author Response
Thank you for taking the time to review the manuscript and provide valuable suggestions. Please see the attachment.
Author Response File: Author Response.pdf