Deep Learning for Skeleton-Based Human Activity Segmentation: An Autoencoder Approach
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This work studies human activity recognition based on skeleton data. The study presents a segmentation method and a deep learning model (an autoencoder) to enhance recognition performance. Three public datasets were used to evaluate the presented method. Although this research provides useful findings, some issues need to be addressed before publication.
1) Improve the abstract by adding the study's main findings and insights.
2) The introduction should be improved by clearly stating the main contributions of this work.
3) Increasing the resolution of Figure 5 is recommended.
4) Please discuss the computational complexity of the presented method.
5) To further strengthen the paper, please provide an analysis of the constraints or drawbacks of the presented approach and explore possible directions for future investigations.
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
In their work, the authors consider an interesting and actively developing area: the recognition and classification of human actions based on a skeletal model. As the main task, the authors chose the automatic segmentation of recorded human actions into separate segments corresponding to different types of activity. The challenge in this area is finding the points of separation between different scenarios or types of activity. As a solution, the authors propose an autoencoder architecture for generalizing data on motor activity. The work is certainly interesting and complete, and it contains a large amount of theoretical and experimental material, along with a comparison and justification of the selected model parameters. The number of references to existing studies and their novelty are acceptable. I can definitely recommend it for publication.
Some notes on the work:
1. Algorithm 2 requires clarification, since its formalization contains several variables that are not defined in the text. I ask the authors to give a clearer and more detailed description of this algorithm, paying special attention to how the stopping points are chosen.
2. There are some typos, for example, "eror" in Figure 4.
3. I may have missed it in the text, but I found no information about the specific size of the input and output data in either Figure 4 or Section 4. It is clear from the text that the authors used 25x3=75 values of the skeletal model, but a sequence of model positions (spanning some period of time) is fed to the input of the model. Do I understand correctly that this issue is addressed in Section 4.2, where the final size of the model's input and output is determined (from 50x75 to 200x75)? On the other hand, in line 516, I see the phrase "75 input features". I ask the authors to clarify this point.
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have addressed all the previous comments.