Article
Peer-Review Record

Human Action Recognition-Based IoT Services for Emergency Response Management

Mach. Learn. Knowl. Extr. 2023, 5(1), 330-345; https://doi.org/10.3390/make5010020
by Talal H. Noor
Submission received: 6 February 2023 / Revised: 2 March 2023 / Accepted: 9 March 2023 / Published: 13 March 2023
(This article belongs to the Special Issue Deep Learning in Image Analysis and Pattern Recognition)

Round 1

Reviewer 1 Report

A real-time emergency response management-focused IoT services architecture based on human action recognition is presented. Different algorithms for fall detection are investigated. However, there exist some issues in the current manuscript.

--There are grammar issues, e.g., mistakes of singular and plural forms; please correct them in the revised manuscript.

--The image resolution of figure 1 is too low.

--The methods listed in Table 1 should be described clearly. E.g., how are features extracted from video clips? Which layer are the features extracted from? What are the feature dimensions? How are the CNN models used for classification? Do you replace the last classification layer and re-train/fine-tune the CNN? In general, there are important details missing in the method and experiment sections.

 

Author Response

Response to the Editor-in-Chief:

Response: Thank you; I am grateful for your precious time and vital comments. According to the comments given, I have modified the manuscript significantly to improve the quality of the work. I have taken the comments from the referees and addressed them in the revised version of the manuscript. The modifications are highlighted wherever applicable.

 

 

The Answers to the Referee’s Comments:

Thank you for pointing out that the work could be understood in this way. Based on your valuable comments, I have carried out some work on the manuscript. I am grateful for that as I have got important ideas that helped me to improve the quality of the work.

 

The Answers to the Reviewer 1

 

Reviewer 1:

 

Comment 1

There are grammar issues, e.g., mistakes of singular and plural forms; please correct them in the revised manuscript.

 

Answer 1

I am grateful for this crucial comment, and I have revised the manuscript based on the comments. The grammatical and spelling errors have been checked carefully.

 

Comment 2

The image resolution of figure 1 is too low.

Answer 2

 

Thank you very much. Figure 1 is replaced by a new one.

 

 

Comment 3

The methods listed in Table 1 should be described clearly. E.g., how are features extracted from video clips? Which layer are the features extracted from? What are the feature dimensions? How are the CNN models used for classification? Do you replace the last classification layer and re-train/fine-tune the CNN? In general, there are important details missing in the method and experiment sections.

Answer 3

 

Thank you very much for the decisive comment. The feature dimension is determined by the action feature matrix (Eq. 4). In addition, I have added detailed information about the CNN model used for classification.

 

Quotation from (Page 9, Section 4.3.2)

 

 

 

 

Author Response File: Author Response.docx

Reviewer 2 Report

The problem addressed is quite interesting. However, the authors need to introduce the technical novelty more clearly. The method seems to be a combination of existing techniques. The authors also need to report the speed of the method.
It would be interesting if the authors could compare the method with RGB video-based solutions. Does using IoT lead to better or worse results?
The authors also need to discuss the disadvantages of using IoT for human action analysis. For example, can this be used for long-distance action recognition, say, action recognition from a UAV? The authors are encouraged to discuss this.
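
One hedged way to report the speed requested in this comment is to time the classifier over a set of frames and state the mean latency and the throughput in frames per second. The sketch below is illustrative only: `classify_frame` is a hypothetical stand-in for the actual CNN/SVM pipeline, and the frames are placeholder data, not the UR fall detection dataset.

```python
import time


def classify_frame(frame):
    # Hypothetical stand-in for the real model; replace with the
    # actual CNN/SVM inference call when measuring a real system.
    return sum(frame) > 0


def measure_speed(classifier, frames, warmup=2):
    """Return (mean latency in seconds, frames per second)."""
    # Warm-up calls are excluded from timing so that one-off
    # start-up costs do not distort the measurement.
    for frame in frames[:warmup]:
        classifier(frame)

    start = time.perf_counter()
    for frame in frames:
        classifier(frame)
    elapsed = time.perf_counter() - start

    mean_latency = elapsed / len(frames)
    fps = len(frames) / elapsed if elapsed > 0 else float("inf")
    return mean_latency, fps


# Placeholder frames: 200 flat lists of 1000 values each (an assumption).
frames = [[0.0] * 1000 for _ in range(200)]
latency, fps = measure_speed(classify_frame, frames)
print(f"mean latency: {latency * 1000:.3f} ms, throughput: {fps:.1f} frames/s")
```

Reporting both numbers, together with the hardware used, lets readers judge whether the method is fast enough for real-time use.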

Author Response

Response to the Editor-in-Chief:

Response: Thank you; I am grateful for your precious time and vital comments. According to the comments given, I have modified the manuscript significantly to improve the quality of the work. I have taken the comments given by the referees and addressed them in the revised version of the manuscript. The modifications are highlighted wherever applicable.

 

 

The Answers to the Referee’s Comments:

Thank you for pointing out that the work could be understood in this way. Based on your valuable comments, I have carried out some work on the manuscript. I am grateful for that as I have got important ideas that helped me towards improving the quality of this work.

 

The Answers to the Reviewer 2

 

Reviewer 2:

 

Comment 1

Moderate English changes required

Answer 1

I am grateful for this crucial comment, and I have revised the manuscript based on the comments. The grammatical and spelling errors have been checked carefully.

 

Comment 2

The authors need to introduce the technical novelty more clearly.

Answer 2

 

I thank the reviewer for the constructive comment. In the revised version of the manuscript, every attempt has been made to make the technical novelty clearer.

 

Quotation from (Pages 2 and 9, Sections 1 and 4.3.2)

 

 

 

Comment 3

It would be interesting if the authors could compare the method with RGB video-based solutions. Does using IoT lead to better or worse results?

 

Answer 3

 

Thank you very much for the important comment. The system was implemented using the UR fall detection dataset and promising results were obtained, in which using IoT led to better results. In fact, this point is very important, since I will work on developing it in the future. I have elaborated on this matter in the updated version of the manuscript.

 

Quotation from (Page 13, Section 6)

 

 

 

Comment 4

The authors also need to discuss the disadvantages of using IoT for human action analysis. For example, can this be used for long-distance action recognition, say, action recognition from a UAV? The authors are encouraged to discuss this.

Answer 4

 

Thank you very much for the constructive comment. In fact, gathering a real-world dataset from IoT camera devices is intended as part of the future work. This dataset can contain more challenging images, such as images of emergency incidents that involve crowds and images of emergency incidents captured from long distances (e.g., images collected by drones or Unmanned Aerial Vehicles (UAVs)), so that the current results can be compared with results on the more challenging images. I have discussed these ideas in the updated version of the manuscript.

 

 

Quotation from (Page 13, Section 6)

 

Author Response File: Author Response.docx

Reviewer 3 Report

The author presents an architecture for gathering fall information and relaying it to the appropriate emergency responder. They also present an algorithm for classifying a video as containing a person falling. Their algorithm is a hybrid convolutional neural network (CNN) and support vector machine (SVM). They compare their results to several other methods.

While the author states what they feel the contributions of the manuscript are, the only potentially novel research contribution is the fall classification algorithm. The architecture for connecting the fall classification to emergency responders is an engineering implementation problem, not a research problem. It is also not clear if the architecture was implemented or tested.

While the fall classification algorithm is compared to several other methods, it is not clear whether these methods were previously applied to the fall database that the author uses. Since the database is public, there must be other authors who have evaluated fall prediction on the database. The author should compare to these methods.

The manuscript can be considerably shortened by focusing on what is potentially novel about the work, that is, the fall identification algorithm. The emergency response architecture should be removed from the manuscript.

More detail is needed on the architecture of the CNN, such as number and types of layers, and training parameters. A diagram would be appropriate for the whole CNN/SVM.

The labels on the confusion matrices do not make sense. Which cells are the true positives, and which are the true negatives?

 

Author Response

Response to the Editor-in-Chief:

Response: Thank you; I am grateful for your precious time and vital comments. According to the comments given, I have modified the manuscript significantly to improve the quality of the work. I have taken the comments given by the referees and addressed them in the revised version of the manuscript. The modifications are highlighted wherever applicable.

 

 

The Answers to the Referee’s Comments:

Thank you for pointing out that the work could be understood in this way. Based on your valuable comments, I have carried out some work on the manuscript. I am grateful for that as I have got important ideas that helped me to improve the quality of the work.

 

The Answers to the Reviewer 3

 

Reviewer 3:

 

Comment 1

English language and style are fine/minor spell check required

 

Answer 1

I am grateful for this crucial comment, and I have revised the manuscript based on the comments. The grammatical and spelling errors have been checked carefully.

 

Comment 2

While the author states what they feel the contributions of the manuscript are, the only potentially novel research contribution is the fall classification algorithm. The architecture for connecting the fall classification to emergency responders is an engineering implementation problem, not a research problem. It is also not clear if the architecture was implemented or tested.

Answer 2

 

Thank you very much for this comment. In fact, the system was implemented using the UR fall detection dataset and promising results were obtained. The proposed work also represents an important research point according to the contribution mentioned on Page 2, Section 1. In addition, more details about the implementation have been added on Pages 10-11, Section 5.

 

Quotation from (Pages 2, 10-11, Sections 1 and 5).

 

 

Comment 3

The manuscript can be considerably shortened by focusing on what is potentially novel about the work, that is, the fall identification algorithm. The emergency response architecture should be removed from the manuscript.

Answer 3

 

Thank you very much for the constructive comment. This point is very important. More details on the implementation of the emergency response architecture have been added to the updated manuscript, and I therefore consider it essential to keep the architecture in the manuscript.

 

Quotation from (Pages 10-11, Section 5).

 

 

 

Comment 4

More detail is needed on the architecture of the CNN, such as number and types of layers, and training parameters. A diagram would be appropriate for the whole CNN/SVM.

Answer 4

 

More detailed information about the CNN model used for classification has been added to the updated manuscript.

 

Quotation from (Page 9, Section 4.3.2)

 

Comment 5

The labels on the confusion matrices do not make sense. Which cells are the true positives, and which are the true negatives?

Answer 5

Many thanks for your comment. In Figure 5, "Confusion matrix for EfficientNetB0, MobileNetV2, ResNet50, Xception, method 1 and method 2", each confusion matrix is a grid that shows the number of True Positives (TP) in cell 22, False Positives (FP) in cell 21, True Negatives (TN) in cell 11, and False Negatives (FN) in cell 12.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

The author has addressed my concerns and improved the manuscript substantially.

Author Response

Thank you for your valuable comments and for pointing out that all of the comments have been addressed. I hope that the manuscript now can be accepted for publication.

 

Reviewer 3 Report


Comments for author File: Comments.pdf

Author Response

The Answers to the Referee’s Comments R2:

Thank you for pointing out that the work could be understood in this way. Based on your valuable comments, I have carried out some work on the manuscript. I am grateful for that as I have got important ideas that helped me to improve the quality of the work.

 

The Answers to the Reviewer 3

 

Reviewer 3:

 

Comment 1

The author clarified the contribution of the architecture in the current version of the manuscript, and that the architecture was implemented. The author now provides the architecture of their system.

Answer 1

Thank you for this comment. I am glad that the previous comment has been addressed.

 

Comment 2

 

 

The author’s response to this reviewer indicates which cells correspond to TP (cell22), FP (cell 21), TN (cell11), and FN (cell12). However, the labelling on the confusion matrices in Figure 6 still does not convey that information. Something like the following would help the reader.

Actual \ Predicted    Positive    Negative
Positive              TP          FN
Negative              FP          TN
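
As a minimal sketch of the layout suggested above (rows are actual classes, columns are predicted classes, positive class listed first), the four cells can be counted as follows. The function name `confusion_counts`, the label encoding (1 = fall, 0 = no fall), and the sample labels are illustrative assumptions and are not taken from the manuscript.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return [[TP, FN], [FP, TN]] with rows = actual, columns = predicted."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # actual positive, predicted positive
        elif a == positive:
            fn += 1  # actual positive, predicted negative
        elif p == positive:
            fp += 1  # actual negative, predicted positive
        else:
            tn += 1  # actual negative, predicted negative
    return [[tp, fn], [fp, tn]]


# Made-up labels for illustration only (1 = fall, 0 = no fall).
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

matrix = confusion_counts(actual, predicted)
(tp, fn), (fp, tn) = matrix

accuracy = (tp + tn) / (tp + fn + fp + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print("Actual \\ Predicted  Positive  Negative")
print(f"Positive            {tp:<9} {fn}")
print(f"Negative            {fp:<9} {tn}")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Printing the row and column labels alongside the counts makes it unambiguous which cell holds which quantity, which is the point raised in this comment.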

           

Answer 2

 

Thank you very much for your observation. For this reason, Figure 6 has been revised to remove any confusion and ambiguity.

 

 

Comment 3

The author did not address this comment from my previous review: While the fall classification algorithm is compared to several other methods, it is not clear whether these methods were previously applied to the fall database that the author uses. Since the database is public, there must be other authors who have evaluated fall prediction on the database. The author should compare to these methods.

Answer 3

 

Thank you very much for the decisive comment. Yes, this is true: the comparison presented in this work is between different CNN architectures, including EfficientNetB0, MobileNetV2, ResNet50, and Xception. Additionally, I used two different methods: method 1, where the SVM classifier was used to identify the features only, and method 2, where the CNN and SVM are combined for the whole process. To the best of my knowledge, I am the first to use such deep learning techniques on the UR fall dataset. Therefore, I believe other authors who plan to use deep learning techniques on this dataset can compare with my work, which will make the comparison fair.

 

 

 

 

Author Response File: Author Response.pdf

Round 3

Reviewer 3 Report

1. The author did a good job addressing this reviewer’s comment about clarifying the confusion matrices.

 

2. The author did a good job addressing the comment from my previous review about the comparison to other algorithms. I would suggest that the author add the following sentence or one similar to it somewhere early in the paper where they talk about their contributions.


“To the best of our knowledge, this is the first use of deep learning techniques on the UR fall dataset”
