Article
Peer-Review Record

Efficient Online Engagement Analytics Algorithm Toolkit That Can Run on Edge

Algorithms 2023, 16(2), 86; https://doi.org/10.3390/a16020086
by Saw Thiha and Jay Rajasekera *
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 10 December 2022 / Revised: 27 January 2023 / Accepted: 31 January 2023 / Published: 6 February 2023
(This article belongs to the Special Issue Advances in Cloud and Edge Computing)

Round 1

Reviewer 1 Report

This article deals with the use of machine learning for blink detection, using videos from video conferences as the dataset. The article is quite interesting, but some modifications are needed to improve its quality.

Comments and suggestions:

1. It was not at all clear to me from the abstract what the main thrust of the article was going to be. I recommend rewording the final part of the abstract.

2. The Related works section should be substantially expanded. The article needs to be put in some context.

3. "raspberry pi" should be written as "Raspberry Pi"

4. Expand the Conclusion section and write the main application or scientific contributions of your work in bullet points.

After major revision, the article can be judged again.

Author Response

Many thanks for the valuable comments. We feel your comments helped to improve the paper considerably.
## Comment 1
It was not at all clear to me from the abstract what the main thrust
of the article was going to be. I recommend rewording the final part
of the abstract
### Response
The contribution of this paper is threefold:
- An open-source online engagement analytics toolkit that supports eye
blink detection and face orientation detection
- Simple and accurate face orientation (i.e., horizontal and vertical)
detection using a nose landmark
- Improving the existing Eye Aspect Ratio threshold for eye blink
detection and statistically proving that face orientation is a
significant factor in that regard
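To make the second and third contributions concrete, here is a minimal sketch of the standard Eye Aspect Ratio (EAR) computation that threshold-based blink detection builds on. The landmark ordering and the fixed threshold value are illustrative assumptions, not the toolkit's actual implementation (which, per the response, adapts the threshold to face orientation).

```python
import math

def ear(eye):
    """Eye Aspect Ratio from six 2D eye landmarks (p1..p6),
    ordered: outer corner, two top points, inner corner, two bottom points."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    p1, p2, p3, p4, p5, p6 = eye
    # Two vertical eyelid gaps over twice the horizontal eye width.
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))

def is_blink(eye, threshold=0.2):
    """Flag a blink when EAR drops below the threshold.
    The threshold here is a fixed illustrative constant."""
    return ear(eye) < threshold
```

A closed eye collapses the vertical gaps while the horizontal width stays roughly constant, so EAR falls sharply during a blink; the paper's argument is that a single fixed threshold is insufficient when the face is turned.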

## Comment 2
The Related works section should be substantially expanded. The
article needs to be put in some context.
### Response
We have added the relevant literature and the context surrounding the research.

## Comment 3
"raspberry pi" should be written as "Raspberry Pi"
### Response
We fixed the capitalization, vocabulary, and citation
conventions according to the reviewer's kind suggestions.

## Comment 4
Expand the Conclusion section and write the main application or
scientific contributions of your work in bullet points.
### Response
We have concisely summarized our scientific contributions and the
possible application of our toolkit in the conclusion as bullet points.

Reviewer 2 Report

- Goals, aims, and motives are not clearly stated. It is not clearly stated why the authors need eye blink detection. Is it more useful for monitoring a driver in a car than in videoconferencing?

- You should reorganize your paper in order to be more focused and to have a clear organization. The Introduction does not provide a motive for the research, and there are too many unnecessary sentences.

- "The authors propose z-score normalization as an alternative to eliminate aforementioned external noises for robust head orientation and eye blink detection." Why? Can you make comparison analysis to prove that this score is the best?

- References should be updated with novel ones from 2021 and 2022.

- Reference [1] is not written according to the instructions. It is not clear whether it is a web page, a paper, a book, or something else.

- Please comment:

https://dx.doi.org/10.5944/openpraxis.12.4.1113

https://doi.org/10.1145/3411764.3445294

https://doi.org/10.1007/s10639-021-10597-x

https://doi.org/10.17577/IJERTV11IS060281 (https://www.researchgate.net/publication/361802180_Smart_Online_Examination_Anti-Cheat_System)

 

Author Response

Many thanks for the valuable comments. We feel your comments helped to improve the paper considerably.

## Comment 1
Goals, aims, and motives are not clearly stated. It is not clearly
stated why the authors need eye blink detection. Is it more useful for
monitoring a driver in a car than in videoconferencing?
### Response
In this revised version, we have explained the context surrounding the research and our approach to the current problems with deploying face analytics in online engagements (scalability and privacy concerns): we developed an
open-source online engagement analytics toolkit that runs directly
on the audience's devices. This way, the audience can be assured that
they are not sharing any video across the web, and they can see how the
analytics work because the toolkit is open-source. Organizations and
meeting hosts, in turn, will not need GPUs or other expensive,
hard-to-scale hardware. We also explained why eye blink detection is relevant for
proctoring and engagement analytics in the Introduction section. Thank
you so much for your kind suggestion; we incorporated it into the
context as well.

## Comment 2
You should reorganize your paper in order to be more focused and to
have a clear organization of the paper. Introduction does not provide
a motive for the research, and there are too many unnecessary
sentences.
### Response
We reorganized our paper to reflect our intent and the context of the
research, including the Introduction section. We added many relevant references and sources that we could not include in the first submission. We also cut unnecessary sentences and wrote the content more concisely to reflect the significance of our
research, as much as we could.

## Comment 3
"The authors propose z-score normalization as an alternative to
eliminate aforementioned external noises for robust head orientation
and eye blink detection." Why? Can you make a comparison analysis to
prove that this score is the best?
### Response
In this revised version, we included references that show the effectiveness of z-score standardization in machine learning. One of the references, for instance, demonstrates that the z-score is superior for comparisons
across different samples. Also, the proposed face orientation measure is essentially the
z-score of the nose landmark, since it captures the position of the nose relative to the other landmarks.
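As a minimal sketch of that idea (not the toolkit's actual code), standardizing landmark coordinates to z-scores removes scale and translation effects, and the z-score of the nose landmark then serves as an orientation proxy. The `nose_orientation` helper and its landmark indexing are illustrative assumptions.

```python
import statistics

def z_scores(values):
    """Standardize a sequence to zero mean and unit (population) variance."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]

def nose_orientation(landmark_x, nose_index):
    """Horizontal orientation proxy: the z-score of the nose landmark's
    x coordinate relative to all facial landmarks. Roughly zero when the
    nose is centered among the landmarks; the sign indicates the turn
    direction. The same idea applies to y coordinates for vertical tilt."""
    return z_scores(landmark_x)[nose_index]
```

Because z-scores are invariant to shifting and scaling the raw coordinates, the same measure works regardless of where the face appears in the frame or how close it is to the camera.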

## Comment 4
References should be updated with novel ones from 2021 and 2022.
### Response
Many thanks; we have added many references that are up-to-date as suggested.

## Comment 5
Reference [1] is not written according to the instructions. It is not
clear whether it is a web page, a paper, a book, or something else.
### Response
Our apologies for the reference errors. We have fixed and added
relevant doi links, etc.

## Comment 6
Reference suggestions.
### Response
Thank you so much for the references. We have added those and many
other relevant references.

Reviewer 3 Report

This paper works on an interesting problem with reasonable solutions. The proposed approach to address the scalability issue makes sense, and the authors have done significant work to implement and evaluate the proposed algorithm.

 

However, we can improve this paper in the following areas:

  1. This paper needs more information regarding the related work on online proctoring algorithms. This paper discusses related computer vision research, which is closely related to the approach proposed by this paper. However, it does not explain the related proctoring solutions in general.

  2. This paper explains the individual modules, such as detecting head orientation direction and eye blink detection, but it is not obvious for a reader to understand how they are related to the topic of this paper – online video proctoring. We had better add a section to explain how these modules are used in webcam proctoring.

  3. Section 3 evaluates the individual modules with the CMU dataset. However, it does not explain how the results are related to webcam proctoring. It would be great if this paper can provide some evaluation of "proctoring".

  4. We can add a section to discuss how to deploy the algorithm/implementation to show its usefulness and effectiveness.

 

I guess the following two GitHub repos are the implementations of this paper. It would be great if the authors can include some details regarding the implementations in the paper.

https://github.com/sawthiha/mediapipe_calculators

https://github.com/sawthiha/mediapipe_graphs

Author Response

## Comment 1
This paper needs more information regarding the related work on online
proctoring algorithms. This paper discusses related computer vision
research, which is closely related to the approach proposed by this
paper. However, it does not explain the related proctoring solutions
in general.
### Response
Following the reviewers' kind suggestions, we have added many
related studies and reviews on the topic of proctoring to support the
context surrounding the research, as much as we could.

## Comment 2
This paper explains the individual modules, such as detecting head
orientation direction and eye blink detection, but it is not obvious
for a reader to understand how they are related to the topic of this
paper – online video proctoring. We had better add a section to
explain how these modules are used in webcam proctoring.
### Response
Many thanks for kindly pointing this out! In the Introduction and Related Work sections, we documented the use cases of these analytics for proctoring purposes and more. Since we aim to initiate a face analytics toolkit for proctoring and online engagement as a whole, we included those modules.

## Comment 3
Section 3 evaluates the individual modules with the CMU dataset.
However, it does not explain how the results are related to webcam
proctoring. It would be great if this paper can provide some
evaluation of "proctoring".
### Response
Our apologies for the confusion caused by the misleading 'solution'
in the title. We are simply providing an analytics toolkit for building
proctoring and online engagement solutions, not the solution itself. We
have added the necessary evaluations and fixed the confusion regarding
our scope.

## Comment 4
We can add a section to discuss how to deploy the
algorithm/implementation to show its usefulness and effectiveness.
### Response
We believe this too is a good point. We describe implementation and deployment
details in the README file of the code base, and we are currently working on reorganizing the code, writing elaborate documentation, and providing a demo application with one-click compilation on GitHub.

Round 2

Reviewer 1 Report

I think that the article can be accepted now.

Author Response

Many thanks to Reviewer #1 for your great comments last time and for accepting the paper after we revised it. We really appreciate it!

Reviewer 2 Report

It would be better if you explicitly state the aims, goals, and novelty of the paper in the Introduction (see the answer to Comment 1): "The aim of the paper is...", "Novelties are...", or "Major contributions of the paper are...".

I'm satisfied with the answers to the other comments.

Author Response

## Comment 1
It would be better that you write explicitly aims, goals, and novelty
of the paper in the Introduction (see answer to comment 1). The aim of
the paper is... Novelties are... or Major contributions of the paper
are....
### Response
Thank you so much for the kind suggestion. We added explicit
statements on our aim (i.e., to propose an open-source initiative to
make proctoring-related algorithms accessible, portable, scalable, and
customizable, as a toolkit) and the contributions as a separate part
of the Introduction section.

Reviewer 3 Report

Thanks for the revision. Compared to the previous version, this version adds related work on online proctoring algorithms and clarifies the motivation of the research. However, Sections 1 and 2 seem too long, while Sections 3 and 4 are too short. I would suggest extending Sections 3 and 4 to give more details of the system design and to explain the contributions of this paper. Also, it would be great if this paper could give more information regarding the sample application. Currently, it only provides a few GitHub repos.

 

The system uses RT-BENE for evaluation, which only includes samples of the left eye. Can we use a different dataset with both eyes?

 

I have found many small writing issues. Some of them are the following:

Line 39: "to shape the we" -> "to shape the way we"

Line 56: "implication" -> "implications"

Line 67: "peers important for" -> "peers which is important for"

Line 82: ". Additionally" -> "Additionally"

Line 204: "can be call edge" -> "can be called edge"

Line 405: Looks like it should be "3.3.1"?

Line 441: Looks like it should be "4.2.1"?

 

Author Response

## Comment 1
However, section 1 and 2 seems to be too long, while section 3 and
section 4 are too short. I would suggest extending section 3 and
section 4 to give more details of the system design and explain the
contributions of this paper. The system uses RT-BENE for evaluation,
which only includes samples of the left eye. Can we use a different
dataset with both eyes?
### Response
Thank you so much for the kind tips and guidance.

We found your comments regarding Sections 1 and 2 very justified; we are sorry that we did not notice this while writing. We went through Section 1 and removed many words and sentences (more than 700 words in total). We also went over Section 2 carefully and kept only the necessary parts.

For Sections 3 and 4, we realized the importance of your comments; we are really thankful.

We clarified that our initiative is a proctoring toolkit that supports the analytics mentioned in the methodology, rather than a complete proctoring system. As for the system design, we clarified the overview of the algorithm, which illustrates why we first standardize the landmarks and then use them to calculate the various analytics. We have also newly added explicit statements on our aim (i.e., to propose an open-source initiative to make proctoring-related algorithms accessible, portable, scalable, and customizable, as a toolkit) and the contributions as a separate section in the Introduction.

Regarding the RT-BENE dataset, we improved and evaluated the eye blink detection on two more datasets, Eyeblink8 and Talking Face; both datasets include the left and right eyes that you asked us to use. Indeed, this effectively extended the experimental results section and reinforced our claim of a robust improvement in F1 score and accuracy compared to the existing EAR thresholding method. We really appreciate your comment, which improved the quality of the paper considerably.
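For reference, the F1 score reported in this evaluation is the harmonic mean of precision and recall over blink predictions. A minimal sketch, with hypothetical confusion counts that are not taken from the paper's results:

```python
def f1_score(tp, fp, fn):
    """F1 from confusion counts: tp = correctly detected blinks,
    fp = false alarms, fn = missed blinks."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)
```

Unlike plain accuracy, F1 is not inflated by the many non-blink frames in a video, which is why it is the more informative metric for rare events such as blinks.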

## Comment 2
Also, it would be great if this paper can give more information
regarding the sample application. Currently, it only provides a few
GitHub repos.
### Response
We include a simple application that uses the proposed algorithm in
both C++ and Python (as a package that can be downloaded with pip).
Also, we restructured the project into one main repository so that the
C++ implementation and the evaluation results can be seen together.
Additionally, we included a Python snippet and the installation
instructions in the Appendix (newly added, towards the end of the
paper).

We have done a complete grammar and spell check. Many thanks for kindly checking and reminding us of this important item too. We really appreciate all your comments.

 

Round 3

Reviewer 3 Report

Thanks for the revision. Compared with the previous versions, this version's organization is much better. It clearly explains the motivations of the research and the contributions of the work.

 

I have found some small writing issues:

Line 149: "The authors want customizability and The authors aim" -> "the authors want customizability and aim".

Line 174: "MediaPipe is a framework by Google" -> built or developed by Google?

Line 199: "which support" -> "which supports".

Line 200: "face orientation detection" -> "face orientation detection approach".

Line 438: "the model detect" -> detects or detected?

 
