Peer-Review Record

Is It Worth It? Comparing Six Deep and Classical Methods for Unsupervised Anomaly Detection in Time Series

Appl. Sci. 2023, 13(3), 1778; https://doi.org/10.3390/app13031778
by Ferdinand Rewicki 1,2,*, Joachim Denzler 2 and Julia Niebling 1
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3:
Submission received: 20 December 2022 / Revised: 19 January 2023 / Accepted: 23 January 2023 / Published: 30 January 2023
(This article belongs to the Special Issue Unsupervised Anomaly Detection)

Round 1

Reviewer 1 Report

As I reviewed the manuscript entitled “Is it worth it? An experimental comparison of six deep- and classical machine learning methods for unsupervised anomaly detection in time series.” in detail, I found a number of flaws that should be fixed for the paper to meet the high standards of the journal and of the research community. In this article, the authors apply six different ML and DL methods to detect anomalies in time series data. The presented work is good; however, the following concerns need to be resolved in the next revision:

Revision:

1.      There are many grammatical and spelling mistakes, as well as spacing issues, throughout the manuscript. These should be corrected, and scientific language should be used to meet the high standards of the journal.

2.      The title of the manuscript should be revised: it is too long, and the full stop at the end of the title should be removed.

3.      The framework of the proposed approach is missing from the manuscript.

4.      How do you set/calculate your THV and POT?

5.      What is the size of the sliding window for each selected method? Please discuss this in detail.

6.      How do you calculate precision, recall, and F-score for the proposed model? Please explain in detail (a generic sketch of such an evaluation pipeline is given after this list).

7.      To justify the results, a detailed comparison table with recently published articles should be provided, showing how the selected methods improve on previous work in detecting anomalies in the selected dataset.

8.      How do you calculate your model performance?

9.      The results section is very weak. Please add more statistical analysis and compare your results with the latest state-of-the-art published work.

10.  The manuscript should be revised by a native English speaker to improve the sentence structure and the narrative flow of the paper.
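
Points 4–6 concern how sliding windows, thresholds, and point-wise metrics fit together. The minimal Python sketch below shows one generic way such an evaluation pipeline can look: an anomaly score is computed per sliding window, a peaks-over-threshold (POT) style threshold is fitted on an anomaly-free training segment, and point-wise precision, recall, and F1 are computed against ground-truth labels. It is not the authors' implementation; the window size, quantiles, and the toy scoring function are illustrative assumptions only.

import numpy as np
from scipy.stats import genpareto
from sklearn.metrics import precision_recall_fscore_support

rng = np.random.default_rng(0)
n_train, n_test, window = 2000, 2000, 32
train = rng.normal(size=n_train)                      # anomaly-free training segment
test = rng.normal(size=n_test)
test[1200:1205] += 8.0                                # injected toy anomaly
labels = np.zeros(n_test, dtype=int)
labels[1200:1205] = 1

def window_scores(x, w):
    # Toy anomaly score: largest absolute deviation from the window mean.
    views = np.lib.stride_tricks.sliding_window_view(x, w)
    return np.abs(views - views.mean(axis=1, keepdims=True)).max(axis=1)

train_scores = window_scores(train, window)
test_scores = window_scores(test, window)

# POT-style threshold: fit a generalized Pareto distribution to the excesses of
# the training scores above a high quantile, then take an extreme quantile of it.
u = np.quantile(train_scores, 0.98)
c, _, scale = genpareto.fit(train_scores[train_scores > u] - u, floc=0.0)
threshold = u + genpareto.ppf(0.999, c, scale=scale)

# Point-wise predictions: flag every point covered by an alarming window.
pred = np.zeros(n_test, dtype=int)
for i in np.flatnonzero(test_scores > threshold):
    pred[i:i + window] = 1

p, r, f1, _ = precision_recall_fscore_support(labels, pred, average="binary", zero_division=0)
print(f"precision={p:.3f}  recall={r:.3f}  F1={f1:.3f}")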

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The authors have presented a very elaborate and practically valuable paper comparing tools for anomaly detection. The minor comments below can be considered for further improvement:

1) Benchmark Dataset: UCR Anomaly Archive

Although a thorough description of this dataset has been included, it would be very relevant to describe what the anomalies mean in practice in the different domains. For example, a subtle anomalous change in air temperature might have a very different (clinical) implication than one in respiration or ECG signals. Furthermore, given this context, there should be a short discussion of how one particular type of anomaly can be more "lethal" or detrimental than another.

 

2) Please provide some intuition about the methods' performance if the hyperparameters were not tuned and the tools were used with their default parameters. This might be important for users across different disciplines who do not have sufficient computational resources available to them.

 

3) Although the authors have used three different quality metrics to compare the performances, these measures yield intrinsically different results for the six methods. This becomes more confusing when interpreting the results in Figures 4 and 6. Although the authors explain the subtle differences in the discussion, more details on these inherent differences should be added in "2.3.3. Quality Measures". For example, in lines 338-343, add a sentence explaining why a "UCR score of 1 but a low F1-Score at the same time indicates the detection of the true anomaly". This would certainly make the results easier to understand.
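
To make the divergence between the two measures concrete, the toy example below shows a detector that clearly finds the single true anomaly, giving a UCR-style score of 1, while its point-wise F1-score stays low because most flagged points lie outside the labelled interval. This is a simplified sketch: the exact UCR-score definition used in the paper may differ, e.g. in the tolerance margin and in how the detected location is chosen.

import numpy as np
from sklearn.metrics import f1_score

n = 1000
labels = np.zeros(n, dtype=int)
labels[500:520] = 1                       # true anomaly: 20 points

# The detector flags a broad region around the event: it finds the anomaly,
# but most flagged points are not labelled as anomalous.
pred = np.zeros(n, dtype=int)
pred[450:580] = 1                         # 130 flagged points, 20 of them correct

# Point-wise F1 penalises every flagged point outside the labelled interval.
print("F1:", round(f1_score(labels, pred), 3))    # -> 0.267

# Simplified UCR-style score: 1 if the detected location falls inside the true
# anomaly region enlarged by a tolerance margin, 0 otherwise (margin assumed).
margin = 100
peak = int(np.flatnonzero(pred).mean())   # representative detected location
print("UCR-style score:", int(500 - margin <= peak < 520 + margin))    # -> 1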

 

4) Although it is beyond the scope of the paper to discuss why the classical methods (and even the AE) perform much better, the paper warrants some intuition. For example, one direct intuition may be that in GANF the particular choice of Bayesian priors might have led to over-fitting. Small, plausible intuitive arguments could provide a beneficial starting point for further theoretical underpinnings.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

This manuscript is quite interesting and technically sound; however, it still needs some improvement:

1- The current state of the language, particularly the logical flow of the paper, requires refinement.

2- Provide some background information regarding the proposed approach. The description must include a clear explanation of how the analyzed methods are used.

3- Regarding the contribution, you present only the improvements without analyzing the drawbacks of the analyzed methods; for example, did you notice or expect any computational overhead, or reduced responsiveness, due to the execution of the proposed algorithms?

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
