Article
Peer-Review Record

An Empirical Evaluation of Online Continuous Authentication and Anomaly Detection Using Mouse Clickstream Data Analysis

Appl. Sci. 2021, 11(13), 6083; https://doi.org/10.3390/app11136083
by Sultan Almalki *, Nasser Assery and Kaushik Roy
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 8 April 2021 / Revised: 22 June 2021 / Accepted: 24 June 2021 / Published: 30 June 2021
(This article belongs to the Section Computing and Artificial Intelligence)

Round 1

Reviewer 1 Report

In this paper, the authors develop solutions for user identification and authentication based on mouse-dynamics information, using machine learning methods.

The presented idea is interesting and welcome. I have several suggestions for improving the manuscript.

  1. The shortcomings of previous research should be highlighted more, as should the research gap that you have identified and are trying to fill.
  2. Is there really a need to analyze research that uses the NSL-KDD or KDD Cup 1999 datasets? Those studies try to find anomalies in network traffic; they do not deal with behavioral biometrics.
  3. A UML activity diagram or similar would be welcome to visualize the research phases and activities.
  4. The research methodology should be explained more clearly.
  5. What tool did you use for feature extraction?
  6. In the results, point out the best results.
  7. The ROC curves are not clearly visible. Try placing two figures per row, not four.
  8. The results should be validated by comparing them with the results of other relevant research in the field.
  9. The literature is significantly outdated. Please update it with newer references.

Author Response

Dear Editor and Reviewers:

 

Thank you so much for providing valuable feedback and reviews. In the current version (first version), we addressed the comments given by the reviewers and editor. The changes are highlighted in turquoise color.

 

Overall Modifications:

 

  • We updated the literature with newer references and also highlighted a research gap.
  • Abstract and conclusion: We replaced detailed results with a general comment.
  • We improved the quality of all ROCs by setting up 2 figures by the row.
  • We improved the quality of other figures that needed improvements.
  • We added a UML activity diagram to visualize research phases and activities. 
  • A clear description of the results has been added in the Abstract and also in the results section.
  • All the missing link/references have been added.
  • A clear description of the research methodology has been added.
  • A clear description of the resample and interpolate method has been added.

***********************************************************************

 

Reviewer #1:

 

Thank you for your valuable comments.

 

  1. The shortcomings of previous research should be highlighted more, as should the research gap that you have identified and are trying to fill.

Response: We updated the literature review and added two more references.

  2. Is there really a need to analyze research that uses the NSL-KDD or KDD Cup 1999 datasets? Those studies try to find anomalies in network traffic; they do not deal with behavioral biometrics.

Response: We deleted these two references and added two new references.

  3. A UML activity diagram or similar would be welcome to visualize the research phases and activities.

Response: A UML activity diagram has been added.

  4. The research methodology should be explained more clearly.

Response: A description of the research methodology has been added in Section 4.

  5. What tool did you use for feature extraction?

Response: Pandas and NumPy were the tools used for feature extraction.

  6. In the results, point out the best results.

Response: We updated the literature review and added two more references.

  7. The ROC curves are not clearly visible. Try placing two figures per row, not four.

Response: The ROC curves are now clear, with two figures placed in each row.

  8. The results should be validated by comparing them with the results of other relevant research in the field.

Response: Our results are compared with two other articles in the conclusion section.

  9. The literature is significantly outdated. Please update it with newer references.

 

Response: We updated the literature review and added two more references.
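
To illustrate the reply above that pandas and NumPy were used for feature extraction, the following is a minimal sketch and not the authors' code. The column names `t`, `x`, `y`, the function `extract_features`, and the chosen features (path length, duration, mean and maximum speed) are all assumptions made for illustration; the record does not list the actual feature set.

```python
import numpy as np
import pandas as pd


def extract_features(events: pd.DataFrame) -> dict:
    """Summarize one mouse session.

    `events` is assumed (hypothetically) to have columns:
    t = timestamp in seconds, x and y = cursor position in pixels.
    """
    events = events.sort_values("t")
    dt = np.diff(events["t"].to_numpy(dtype=float))
    dx = np.diff(events["x"].to_numpy(dtype=float))
    dy = np.diff(events["y"].to_numpy(dtype=float))

    # Euclidean distance travelled between consecutive samples
    dist = np.hypot(dx, dy)

    # Ignore zero-length time steps to avoid division by zero
    valid = dt > 0
    speed = dist[valid] / dt[valid]

    return {
        "path_length": float(dist.sum()),
        "duration": float(dt.sum()),
        "mean_speed": float(speed.mean()) if speed.size else 0.0,
        "max_speed": float(speed.max()) if speed.size else 0.0,
    }


# Hypothetical four-sample session
session = pd.DataFrame(
    {"t": [0.0, 0.5, 1.0, 2.0], "x": [0, 3, 3, 9], "y": [0, 4, 4, 12]}
)
features = extract_features(session)
```

A dictionary per session like this can be collected into a DataFrame, one row per session, before any feature selection or classification step.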

Reviewer 2 Report

The aim of the article is interesting and, as shown in Section 2, the topic is widely studied. The reported accuracy percentages are very high; however, the test set is limited to user participation in a game called "Perfect Piano". The sample of records is substantial, but it is limited to interaction with one application ("Perfect Piano") in three scenarios where the movements that users can make are limited. In my opinion, it would be necessary to expand the number of applications and scenarios, so that participants could also choose them randomly, and to generate more records with them.

A person's mouse-movement biometrics differ depending on the application being used at a given moment. That is why, although the subject of the article is interesting, it is necessary to expand the number of data-acquisition scenarios and let users select among them randomly, so that the results are not centered on a single application. In this way, several models could be obtained by applying the proposed machine learning algorithms more broadly, and their results compared.

Regarding the presentation format of the article, it has clear and important deficiencies. Figures and equations are not centered, nor are their titles. The quality of Figures 1, 7, 8 and 11 has to be improved. Symbols are missing in several equations (e.g., 15, 17).

Author Response


Reviewer #2:

1- The aim of the article is interesting and, as shown in Section 2, the topic is widely studied. The reported accuracy percentages are very high; however, the test set is limited to user participation in a game called "Perfect Piano". The sample of records is substantial, but it is limited to interaction with one application ("Perfect Piano") in three scenarios where the movements that users can make are limited. In my opinion, it would be necessary to expand the number of applications and scenarios, so that participants could also choose them randomly, and to generate more records with them.

A person's mouse-movement biometrics differ depending on the application being used at a given moment. That is why, although the subject of the article is interesting, it is necessary to expand the number of data-acquisition scenarios and let users select among them randomly, so that the results are not centered on a single application. In this way, several models could be obtained by applying the proposed machine learning algorithms more broadly, and their results compared.

Response:

Thank you for your valuable comments.

Our data is limited and focused on just movement and point-and-click action. Due to COVID-19 and the time given for the revision, it will be hard to collect samples from more users. Our lab is still closed. Most importantly, most of the current datasets have the number of users in the range of 10-20. For example, the Balabit dataset has 10 users. In addition, to collect our data, we used the standard guidelines and protocols maintained in other popular touch datasets.

 

2- Regarding the presentation format of the article, it has clear and important deficiencies. Figures and equations are not centered, nor are their titles. The quality of Figures 1, 7, 8 and 11 has to be improved. Symbols are missing in several equations (e.g., 15, 17).

Response:

Thank you for your valuable comments.

We improved the quality of all the figures. The equations display clearly in our Microsoft Word file; we will send a PDF file to make sure that everything is clear.

 

 

Reviewer 3 Report

Authors propose behavioral biometric analysis based on mouse movements.

The topic is very interesting, and the proposed work is worth considering for publication. However, the authors must modify the content (i.e., irrelevant information should be removed).

Detailed comments follow:

  • Abstract: Please replace the detailed results with a general comment (e.g., this algorithm performs better than others, or comparable results have been obtained). Readers do not expect to find this kind of detail in an abstract.
  • The related work section should critically discuss other works in order to highlight their limits and how the proposed approach is able to overcome them. The section currently reports a list of methods with no clear connection among them. Details such as accuracy or other measures are not necessary in this section, and they make it difficult to read. This section must be strongly reduced by removing irrelevant information; conversely, relevant information must be added: What are the main findings of the proposed approach? Why did you decide to develop it in this way? How is it able to solve the problems of previous methods?
  • When data, code, or other sources are mentioned, a reference to their web pages must be added (e.g., Academia.edu dataset, AnyBeat dataset, Google+ dataset).
  • Table 1: The content of the table is not clear. The authors must describe the columns of the table and their meaning.
  • Paragraph 3.1: Please add links/references to the pyHook package and the Windows Hooking API.
  • Paragraph 3.6.1: Please remove the list of the two methods. The tool that was used is not important, whereas the method and how it works are relevant. Please add a brief description of the resample and interpolate method, and then add links to their documentation pages.
  • Paragraph 3.6.2.1: I suppose that all these features were returned by the Python library. If so, it is not necessary to describe them in such detail. You can refer to the library documentation page and add a brief description of each feature, especially because feature selection has been applied, so not all of these features have been used.
  • Paragraph 5: It is not clear how you created the non-genuine data. Since all the data refer to students' interaction with the game, you must explain which non-genuine interaction samples you used for training the model.
  • Conclusion: The conclusion should summarize the work. As for the abstract, a high level of detail (such as the performance of each classifier) is not required. Please replace it with more general sentences.

Author Response


Reviewer #3:

 

Thank you for your valuable comments.

Authors propose behavioral biometric analysis based on mouse movements.

The topic is very interesting, and the proposed work is worth considering for publication. However, the authors must modify the content (i.e., irrelevant information should be removed).

Detailed comments follow:

  1. Abstract: Please replace the detailed results with a general comment (e.g., this algorithm performs better than others, or comparable results have been obtained). Readers do not expect to find this kind of detail in an abstract.

Response:

We deleted the detailed results and added general sentences.

  2. The related work section should critically discuss other works in order to highlight their limits and how the proposed approach is able to overcome them. The section currently reports a list of methods with no clear connection among them. Details such as accuracy or other measures are not necessary in this section, and they make it difficult to read. This section must be strongly reduced by removing irrelevant information; conversely, relevant information must be added: What are the main findings of the proposed approach? Why did you decide to develop it in this way? How is it able to solve the problems of previous methods?

Response:

We updated the literature review and added two more references.

  3. Table 1: The content of the table is not clear. The authors must describe the columns of the table and their meaning.

Response:

We explained all the columns of the table and defined each column.

  4. Paragraph 3.1: Please add links/references to the pyHook package and the Windows Hooking API.

Response:

The citation issues are fixed (page 8).

  5. Paragraph 3.6.1: Please remove the list of the two methods. The tool that was used is not important, whereas the method and how it works are relevant. Please add a brief description of the resample and interpolate method, and then add links to their documentation pages.

Response:

The list of the two methods has been removed, and we added a description of the resample and interpolate method.

  6. Paragraph 3.6.2.1: I suppose that all these features were returned by the Python library. If so, it is not necessary to describe them in such detail. You can refer to the library documentation page and add a brief description of each feature, especially because feature selection has been applied, so not all of these features have been used.

Response:

We deleted the unnecessary descriptions and kept a significant one.

  7. Paragraph 5: It is not clear how you created the non-genuine data. Since all the data refer to students' interaction with the game, you must explain which non-genuine interaction samples you used for training the model.

Response:

We added a description on page 21.

  8. Conclusion: The conclusion should summarize the work. As for the abstract, a high level of detail (such as the performance of each classifier) is not required. Please replace it with more general sentences.

Response:

In the abstract, we deleted the detailed results and added general sentences.
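
The "resample and interpolate method" discussed in the Paragraph 3.6.1 comment above can be sketched as follows. This is an illustration only, assuming the method refers to pandas' `resample` and `interpolate` (the record says pandas was used but does not show the call). The column names `t_ms`, `x`, `y`, the function `resample_uniform`, and the 100 ms period are hypothetical.

```python
import pandas as pd


def resample_uniform(events: pd.DataFrame, period: str = "100ms") -> pd.DataFrame:
    """Put irregularly timed mouse samples on a regular time grid.

    `events` is assumed (hypothetically) to have columns:
    t_ms = timestamp in integer milliseconds, x and y = cursor position.
    """
    # Use the timestamps as a TimedeltaIndex so that resample() can bin them
    indexed = events.set_index(pd.to_timedelta(events["t_ms"], unit="ms"))[["x", "y"]]

    # resample() creates one bin per period (empty bins become NaN after mean());
    # interpolate() then fills those gaps linearly between neighbouring bins.
    return indexed.resample(period).mean().interpolate(method="linear")


# Hypothetical recording with a gap between 100 ms and 400 ms
raw = pd.DataFrame(
    {"t_ms": [0, 100, 400], "x": [0.0, 10.0, 40.0], "y": [0.0, 5.0, 20.0]}
)
uniform = resample_uniform(raw)
```

In this sketch the two missing grid points at 200 ms and 300 ms are filled by linear interpolation, which gives every session the same sampling rate before features are computed.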

 

 

 

 

 

Round 2

Reviewer 2 Report

The new document contains the proposed improvements as well as the required explanations. However, as I commented in my previous review, it would be necessary to expand the number of applications and scenarios so that they could also be chosen randomly by the participants and with them generate more records, but I understand that the current situation does not allow it.

Reviewer 3 Report

Authors have modified the article according to the reviewers' suggestions.
