Article

Anomaly Detection in Fractal Time Series with LSTM Autoencoders

by Lyudmyla Kirichenko 1,2, Yulia Koval 1, Sergiy Yakovlev 2,3 and Dmytro Chumachenko 4,5,*
1 Department of Artificial Intelligence, Kharkiv National University of Radio Electronics, 61166 Kharkiv, Ukraine
2 Institute of Mathematics, Lodz University of Technology, 90-924 Lodz, Poland
3 Institute of Computer Sciences and Artificial Intelligence, V.N. Karazin Kharkiv National University, 61022 Kharkiv, Ukraine
4 Mathematical Modelling and Artificial Intelligence Department, Kharkiv Aviation Institute, National Aerospace University, 61072 Kharkiv, Ukraine
5 Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(19), 3079; https://doi.org/10.3390/math12193079
Submission received: 31 August 2024 / Revised: 26 September 2024 / Accepted: 30 September 2024 / Published: 1 October 2024
(This article belongs to the Special Issue Recent Advances in Time Series Analysis)

Abstract

This study explores the application of neural networks for anomaly detection in time series data exhibiting fractal properties, with a particular focus on changes in the Hurst exponent. The objective is to investigate whether changes in fractal properties can be identified by transitioning from the analysis of the original time series to the analysis of the sequence of Hurst exponent estimates. To this end, we employ an LSTM autoencoder neural network, demonstrating its effectiveness in detecting anomalies within synthetic fractal time series and real EEG signals by identifying deviations in the sequence of estimates. Whittle’s method was utilized for the precise estimation of the Hurst exponent, thereby enhancing the model’s ability to differentiate between normal and anomalous data. The findings underscore the potential of machine learning techniques for robust anomaly detection in complex datasets.

1. Introduction

It has been established that many technological, informational, physical, and biological processes exhibit complex fractal structures [1,2,3]. Fractal analysis is widely used for modeling, analyzing, and managing complex systems across various scientific and technical disciplines [4,5,6,7]. In medicine, it is employed for precise disease diagnosis; in economics, for predicting risks and crises through financial time series; in biology, for studying mutations and genetic changes; in geology, for determining the age of geological formations and forecasting seismic activity; and in physics, for investigating thermodynamic processes and turbulence.
Self-similar (fractal) processes exhibit several key properties:
The fundamental property of self-similar processes is their invariance under time scaling. This means that the process looks statistically the same when the time scale is increased or decreased.
Self-similar processes often exhibit long-range dependence, characterized by a slowly decaying autocorrelation function. In these processes, the correlation between values at successive time points decays not exponentially, as in classical stationary processes, but much more slowly, indicating the presence of long-term memory.
Due to their unique properties, self-similar processes are widely used to model time series that exhibit fractal characteristics and long-range dependence, including economic data, hydrological processes, and biomedical signals like EEG.
The Hurst parameter H is a numerical characteristic used to describe the degree of self-similarity and long-term dependence in stationary and non-stationary time series. It is a key metric in the analysis of fractal processes and characterizes their behavior across different time scales.
Detecting significant changes in fractal time series is a complex task, particularly without employing fractal analysis. Such changes may indicate important issues or notable phenomena depending on the research domain.
Fractal properties of many time series arise from their specific correlation and spectral structures. These properties manifest in complex patterns that are not always evident when using traditional spectral or correlation analysis methods. However, for fractal series, changes in these properties can be tracked through fractal characteristics such as the Hurst exponent.
The Hurst exponent is a key parameter reflecting the internal structural properties of a time series and its changes. Therefore, to effectively monitor changes in the Hurst exponent, the use of a specialized neural network designed to detect anomalies in stochastic time series is proposed. In this context, fractal anomalies represent deviations from the normal fractal behavior of the system, which can arise due to technical failures, accidents, or intentional interventions.
The application of a neural network allows for the analysis of changes in the Hurst exponent obtained using a sliding window method, facilitating the dynamic monitoring of changes in time series and the detection of potential anomalies or outliers in the data. Thus, automating the process of anomaly detection through neural networks represents an effective approach for monitoring and analyzing complex systems.

2. Background

Fractal analysis is a critical tool for studying time series that exhibit self-similarity and long-term dependence. References [6,7,8] explore the key applications of fractal analysis methods across various scientific and engineering disciplines, highlighting their role in modeling and describing complex structures. In financial research, fractal models are employed to analyze financial time series and stock prices, accounting for nonlinear dependencies and self-similarity in market dynamics [9,10]. In biomedicine, fractal analysis is applied to study signals such as electroencephalograms (EEG)—recordings of the brain’s electrical activity—and electrocardiograms (ECG)—recordings of the heart’s electrical activity—enabling the identification of patterns associated with different physiological and psychological states [11,12,13]. Across these studies, it has been shown that significant anomalous changes in fractal properties, reflected by variations in the Hurst exponent, indicate substantial changes in the system’s state.
The accurate estimation of the Hurst exponent from time series data is a crucial prerequisite for conducting reliable fractal analysis. The Hurst exponent is a key parameter that characterizes the degree of self-similarity and long-term memory in time series. The works [14,15,16] provide an overview of the main methods for estimating the Hurst exponent, ranging from widely used classical approaches [1,4,5,17,18] to methods that leverage machine learning techniques [19,20,21].
Anomalous changes in the fractal properties of time series can signal significant events or structural shifts in the systems under analysis [22]. The literature in this area includes several directions: anomalous changes in the Hurst exponent are often regarded as indicators of crisis situations or sharp shifts in markets [9]; in biomedical applications, anomalous changes in the Hurst exponent can be linked to pathological conditions such as epileptic seizures or other neurological disorders [12,23]; and anomalies in fractal properties within data traffic often point to network infrastructure failures, cyberattack attempts, or other unusual events impacting traffic structure [24,25,26,27].
The detection of fractal anomalies in time series is a critically important task, particularly in the context of analyzing complex systems such as financial markets, network traffic, or biological signals. These anomalies can indicate fundamental changes in system dynamics that necessitate prompt detection for appropriate action. Therefore, automating the process of detecting such anomalies using machine learning methods is of particular importance. Automated detection significantly enhances the efficiency of monitoring and analysis, providing faster and more accurate responses to changes, which is especially crucial when dealing with large volumes of real-time data.

3. Study Objectives

A self-similar process with discrete time is a stochastic process that exhibits scale-invariance over time. This means that the statistical properties of the process remain unchanged when the time axis is rescaled. Formally, a discrete-time process X(t), t = 1,…, T, is called self-similar if, for any positive scaling factor a > 0, there exists a self-similarity parameter H (also known as the Hurst parameter) such that
Law{X(at)} = Law{a^H X(t)},     (1)
where Law{·} denotes the probability law of the process, so the equality in (1) holds in the sense of distributions.
In the context of discrete time, this implies that a sample of the process, taken at equally spaced time intervals, has the same statistical structure as the original process, but with a scaled magnitude and time. Such processes are often used to model time series with fractal properties, such as financial data, hydrological processes, and signals obtained in biomedical research, including EEG.
By a fractal anomaly, we mean a situation in which, over a specific time interval [t1, t2] of the time series X(t), the Hurst parameter H deviates from a baseline value H_base by more than a specified threshold ε. Formally, this is expressed by the following condition:
|H[t1,t2] − H_base| > ε,
where ε is the allowable deviation, the exceeding of which indicates the presence of an anomaly in the fractal properties of the time series X(t) within the interval [t1, t2]. Such a deviation reflects a change in the self-similar properties of the series on this interval and may signal significant changes in the dynamics of the process.
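For illustration, a minimal check of this condition might look as follows. The helper name is_fractal_anomaly, its arguments, and the default tolerance ε = 0.1 are hypothetical choices introduced here; the windowed estimate is assumed to come from any Hurst estimator applied to X(t) on [t1, t2].

```python
def is_fractal_anomaly(h_window: float, h_base: float, eps: float = 0.1) -> bool:
    """Return True if the windowed Hurst estimate deviates from the baseline by more than eps."""
    return abs(h_window - h_base) > eps

print(is_fractal_anomaly(0.72, 0.70))   # False: within the allowable deviation
print(is_fractal_anomaly(0.90, 0.70))   # True: fractal anomaly on this interval
```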
The objective of this study is to investigate whether changes in the fractal properties of a time series can be detected by transitioning from the analysis of the original time series to the analysis of the sequence of Hurst exponent estimates. Since Hurst exponent estimates are random variables and their variations may be subtle, we suggest employing an LSTM autoencoder neural network to detect anomalies in the sequence of estimates.

4. Materials and Methods

4.1. Fractional Brownian Motion and Fractional Gaussian Noise

Fractional Brownian motion (fBm) is an extension of classical Brownian motion that incorporates fractal properties and long-term dependencies [4,8,28]. It is a stochastic process characterized by the Hurst exponent H, which reflects the degree of self-similarity and correlation in the process.
fBm exhibits self-similarity, meaning that the time series X(t) retains its statistical structure when both time and amplitude are scaled; that is, relation (1) holds for fBm. Fractional Gaussian noise (fGn) is the increment process of fBm. The increments ΔX(t) = X(t + Δt) − X(t) describe changes in the time series, also exhibit fractal properties, and are Gaussian random variables. For fGn, this means they follow a Gaussian distribution with zero mean and a variance that depends on the time interval Δt:
P(\Delta X < x) = \frac{1}{\sqrt{2\pi}\,\sigma_0\,\Delta t^{H}} \int_{-\infty}^{x} \exp\!\left(-\frac{z^{2}}{2\sigma_0^{2}\,\Delta t^{2H}}\right) dz,
where σ0 is the diffusion coefficient.
Exponent H determines the degree of long-range dependence of the fGn. The corresponding correlation function is expressed as follows:
Corr(t, s) = 0.5 × (|t|^{2H} + |s|^{2H} − |t − s|^{2H}),
where t and s are time indices.
Thus, fractional Gaussian noise provides a mathematical foundation for describing and analyzing time series with fractal and long-term dependencies, which is crucial for understanding and modeling complex systems. fGn enables the modeling of intricate time series with fractal properties and has a wide range of applications across various scientific and engineering disciplines. In particular, fGn is used for signal analysis, such as in electroencephalograms (EEGs), to detect changes in brain activity.
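The authors do not provide their simulation code; the sketch below is one self-contained way to generate fGn directly from the autocovariance implied by the formulas above, using a Cholesky factor of the covariance matrix. It is exact but O(n³), so it is suitable only for short illustrative series; an FFT-based generator or a library such as fbm would be the usual choice for long series.

```python
import numpy as np

def fgn_cholesky(n: int, hurst: float, sigma: float = 1.0, seed: int | None = None) -> np.ndarray:
    """Generate n samples of fractional Gaussian noise with the given Hurst exponent.

    Uses the exact fGn autocovariance
        gamma(k) = 0.5 * sigma**2 * (|k + 1|**(2H) - 2|k|**(2H) + |k - 1|**(2H))
    and a Cholesky factor of the resulting Toeplitz covariance matrix.
    """
    rng = np.random.default_rng(seed)
    k = np.arange(n)
    gamma = 0.5 * sigma ** 2 * (np.abs(k + 1) ** (2 * hurst)
                                - 2 * np.abs(k) ** (2 * hurst)
                                + np.abs(k - 1) ** (2 * hurst))
    cov = gamma[np.abs(np.subtract.outer(k, k))]    # Toeplitz covariance matrix
    return np.linalg.cholesky(cov) @ rng.standard_normal(n)

# Example: 500 fGn samples with H = 0.7, as used in the experiments below.
x = fgn_cholesky(500, hurst=0.7, seed=0)
```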

4.2. Whittle’s Method

The Whittle method is a technique for estimating the Hurst exponent that utilizes semiparametric estimates of the memory of a process and is based on the periodograms of time series [14,15,16].
Whittle’s method is based on estimating the spectral density S(f) of the self-similar time series X(t). In this case, it is assumed that the spectral density S(f) follows a specific form related to the Hurst parameter H. For fractal processes, the spectral density is often modeled as a power law of frequency, S(f) ∼ f^{−2H−1}.
Whittle’s method constructs a likelihood function based on the estimated spectral density. The likelihood function L(H) for the Hurst parameter H is given by the formula
L(H) = \sum_{i=1}^{N} \frac{S(f_i)}{\hat{S}(f_i, H)},
where Ŝ(f_i, H) represents the theoretical spectral density at frequency f_i for a given H, and S(f_i) is the observed spectral density.
The Hurst parameter H is estimated by maximizing the likelihood function L(H). This estimation process finds the value of H that best fits the observed spectral density to the theoretical model.
Whittle’s method is highly regarded for estimating the Hurst parameter due to its effective use of spectral density, which captures frequency-domain information and enhances accuracy. It is statistically efficient, often providing precise estimates with smaller sample sizes and demonstrating robustness to noise. This property is of significant importance because physiological measurements are typically subject to high levels of noise, even in carefully controlled experiments. Such noise can distort data and make it difficult to isolate accurate estimates, especially when small samples are used, which is a common practice in experimental research [29]. Additionally, the method excels in analyzing long-range dependencies and fractal behavior, making it versatile across different types of time series data. Its theoretical foundation in likelihood estimation and spectral analysis further supports its reliability and flexibility.
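The paper’s Whittle estimator implementation is not given. The sketch below implements the closely related local (Gaussian semiparametric) Whittle estimator, which relies only on the low-frequency power-law behavior of the spectrum of the stationary fGn series, S(f) ∝ f^{1−2H} near zero frequency; the function name and the default choice of the number of frequencies m are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def local_whittle_hurst(x: np.ndarray, m: int | None = None) -> float:
    """Estimate the Hurst exponent by minimizing the local Whittle objective.

    Assumes the series has spectral density S(f) ~ G * f**(1 - 2H) at low
    frequencies (the stationary fGn case); only the first m Fourier
    frequencies enter the objective.
    """
    n = len(x)
    m = m or n // 4
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n                  # low Fourier frequencies
    periodogram = np.abs(np.fft.fft(x - np.mean(x))[1:m + 1]) ** 2 / (2.0 * np.pi * n)

    def objective(h: float) -> float:
        g_hat = np.mean(periodogram * lam ** (2.0 * h - 1.0))    # profiled scale parameter G
        return np.log(g_hat) - (2.0 * h - 1.0) * np.mean(np.log(lam))

    return float(minimize_scalar(objective, bounds=(0.01, 0.99), method="bounded").x)

# Example: the estimate for synthetic fGn with H = 0.7 should be close to 0.7.
# print(local_whittle_hurst(fgn_cholesky(4096, hurst=0.7, seed=0)))
```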

4.3. Choice of Neural Networks

Neural networks have gained significant popularity in recent years due to their successful application in various tasks, including time series forecasting and anomaly detection. Recurrent neural networks (RNNs), particularly Long Short-Term Memory (LSTM) networks, are of particular interest in this context. Unlike conventional feedforward neural networks that process data as isolated instances, LSTM networks have the ability to retain information about previous inputs by creating a feedback loop where the output of one time step becomes the input for the next. This feature enables LSTM networks to effectively handle sequences of data and capture long-term dependencies. An LSTM block comprises four main components: the memory cell, the input gate, the output gate, and the forget gate. These components work together to maintain, update, and retrieve information throughout the sequence, allowing the model to accurately predict subsequent steps.
One application of LSTM is an LSTM autoencoder, which is a self-supervised neural network [30,31,32,33]. LSTM neural networks and LSTM autoencoders are both architectures based on LSTM. An LSTM neural network typically consists of one or more LSTM layers that are trained to capture temporal dependencies in the data. Following the LSTM layers, dense layers may be added for tasks such as classification or regression. The primary objective of an LSTM neural network is to model temporal dependencies and predict future values based on current data.
An LSTM autoencoder consists of two parts: an encoder and a decoder. The encoder compresses the input data, reducing its dimensionality, and learns to create a latent representation of the data. Typically, the encoder is composed of LSTM layers. The decoder reconstructs the original data from the compressed latent representation and is often also built with LSTM layers. The primary task of the autoencoder is to learn to reconstruct the input data.
Both types of networks can be used for anomaly detection in time series. An LSTM neural network produces predictions that can be compared with actual data to identify anomalies. An LSTM autoencoder reconstructs the input data, allowing for comparison with the original data to detect deviations and anomalies.
In this method, the model receives a time series X(t) as input,
X(t) = {x1, x2,…, xT},
and then attempts to reproduce the same sequence at the output.
The encoder in the LSTM autoencoder architecture plays a crucial role in processing time series data by transforming them into a compact representation, which is then used for various tasks such as forecasting and anomaly detection. The LSTM encoder takes a time series sequence X(t) as input. It passes through several LSTM layers that extract temporal dependencies and long-term patterns from the data. After processing the input sequence, the encoder typically produces a fixed-size vector z, representing the final hidden state of the LSTM. This vector contains compressed information about the entire sequence, enabling the decoder to reconstruct the original data.
For forecasting tasks, the encoder can be extended with additional layers and mechanisms. For instance, a layer can be added to transform the code z into a forecast vector. The model can be trained on historical data and then used to predict future values of the time series.
LSTM encoders are also effectively applied to anomaly detection. After the model is trained on normal data, it is used to reconstruct new sequences. The main idea is that the model should reproduce normal data with minimal error, while anomalies will result in significant reconstruction errors.
During anomaly detection, the model calculates the reconstruction error as the difference between the original and reconstructed sequence:
\mathrm{Error}(t) = \left| x(t) - \hat{x}(t) \right|,
where x(t) is the actual value and x̂(t) is the predicted (reconstructed) value at time step t. If the reconstruction error exceeds a predefined threshold, the data sequence is considered anomalous. This threshold can be determined from a statistical analysis of the reconstruction errors on the training dataset.
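A sketch of this decision rule is given below. Here, model is assumed to be a trained Keras autoencoder (an architecture along the lines described in Section 5), X_train and X_test are assumed arrays of shape (n_sequences, sequence_length, 1), and the mean-plus-three-standard-deviations threshold is one common convention; the paper does not state which statistic of the training errors it uses.

```python
import numpy as np

def reconstruction_errors(model, sequences: np.ndarray) -> np.ndarray:
    """Mean absolute reconstruction error per input sequence (shape: n_seq, seq_len, 1)."""
    reconstructed = model.predict(sequences, verbose=0)
    return np.mean(np.abs(sequences - reconstructed), axis=(1, 2))

# Threshold from the errors on the (anomaly-free) training set.
train_errors = reconstruction_errors(model, X_train)
threshold = train_errors.mean() + 3.0 * train_errors.std()   # 3-sigma rule (an assumption)

# Flag test sequences whose reconstruction error exceeds the threshold.
test_errors = reconstruction_errors(model, X_test)
is_anomalous = test_errors > threshold
```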
In the training process of the autoencoder, the encoder passes the compact representation z to the decoder. The decoder reconstructs the original data based on this representation, using a reverse LSTM architecture. The decoder may also include additional layers to improve the accuracy of reconstruction and forecasting. After training is complete, the decoder is removed, leaving only the encoder, which can be used for processing new data and detecting anomalies.
All software utilized for the experiments reported in this study was developed using the Python programming language version 3.10.12. Python is a high-level, object-oriented, general-purpose programming language with open-source licensing, which has become one of the most popular and widely used languages worldwide. It is characterized by its flexibility, readability, and easy-to-remember syntax. Due to its extensive library ecosystem, Python enables the resolution of complex problems across various fields. Currently, several libraries have been developed in Python that facilitate the modeling of fractal Brownian motion through various approaches. These libraries also include programs that enable the estimation of the Hurst exponent from time series data, including the Whittle method.

5. Computational Experiment: Study of Changes in the Hurst Exponent Using LSTM Autoencoders

To conduct the experiment, time series of fractional Gaussian noise (fGn) X(t), t = 1, …, N, were simulated with a given Hurst exponent H1. At some random moment M, the value of the Hurst exponent was changed to H2, artificially creating a fractal anomaly in the series.
For analyzing the changes in the Hurst exponent, a software-implemented Whittle estimator was applied. It was combined with a sliding-window technique, which allowed the Hurst exponent H to be calculated within each window of a specified length T. As a result of these calculations, a new time series Hx(t) consisting of Hurst exponent values was generated. The elements of this series were obtained by moving a window of size T along the time series X(t) and applying the Whittle method to compute the Hurst exponent for each segment, with the window shifted by a specified number of elements p at each iteration. The resulting time series of Hurst exponents Hx(t), of length Lw = (N − T)/p + 1, was used as input data for the neural network (autoencoder). Thus, we perform time series resampling based on a sliding window, transforming the original time samples into a window index; we use this term below to denote the new time index of the Hurst exponent series.
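Putting the pieces together, the following sketch reproduces this resampling step using the illustrative helpers defined earlier (fgn_cholesky, local_whittle_hurst). The series length, change point, and Hurst values below are shortened placeholders rather than the paper’s N = 1,000,000 setup, and concatenating two independent fGn segments is only an approximation of a series whose Hurst exponent switches at M.

```python
import numpy as np

def hurst_series(x: np.ndarray, window: int, step: int) -> np.ndarray:
    """Sliding-window Hurst estimates Hx(t): one estimate per window position."""
    starts = range(0, len(x) - window + 1, step)
    return np.array([local_whittle_hurst(x[s:s + window]) for s in starts])

# Synthetic fractal anomaly: H changes from H1 to H2 at a random moment M.
rng = np.random.default_rng(1)
n, h1, h2 = 6000, 0.9, 0.7                        # shortened illustrative lengths
m_change = int(rng.integers(n // 3, 2 * n // 3))  # random change point M
x = np.concatenate([fgn_cholesky(m_change, h1, seed=1),
                    fgn_cholesky(n - m_change, h2, seed=2)])

hx = hurst_series(x, window=400, step=20)         # series of Hurst estimates Hx(t)
hx_centered = hx - hx.mean()                      # centered input for the autoencoder
```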
In the initial experiments, we employed an LSTM neural network comprising three hidden layers with 64, 256, and 100 neurons. These layers were designed to capture temporal dependencies and complex patterns in the data: the first layer with 64 neurons captured initial temporal dependencies, the second layer with 256 neurons learned more intricate and long-term dependencies, and the third layer with 100 neurons refined the learned representations to enhance anomaly detection. The network was trained on normalized time series data, where it learned to reconstruct normal time series, with anomalies detected as deviations from these reconstructed values, thus allowing the effective identification of unusual or unexpected patterns. The model was successful in detecting isolated anomalies, such as spikes, but demonstrated limitations in identifying abnormal changes in the level of the time series. Therefore, based on the results of studies [31,32], an LSTM autoencoder model was chosen for anomaly detection.
The recurrent neural network autoencoder was implemented within the LSTM_Autoencoder class, which includes several functions for model creation, training, prediction, and result visualization. Figure 1 illustrates the architecture of the neural network specifically employed in our experiment to detect changes in the Hurst exponent.
The LSTM autoencoder model is composed of an encoder and a decoder, with an architecture of seven layers. The encoder includes three LSTM layers with 5, 16, and 1 neuron(s), respectively, followed by a RepeatVector layer that repeats the latent vector 5 times. The decoder takes the sequence returned by the RepeatVector layer and attempts to reconstruct the input sequence: it mirrors the encoder with two LSTM layers of 5 and 16 neurons, respectively, and concludes with a TimeDistributed dense layer with 1 neuron. The dense layer accommodates the number of input and output features, and the TimeDistributed wrapper applies it to each vector in the sequence individually, returning sequential outputs from the decoder.
The compile method is used with the Adam optimizer and Mean Squared Error (MSE) loss function to compile the model. The network is trained for 10 epochs with a batch size of 32. It features a ReLU activation function for all LSTM layers.
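A minimal Keras sketch of the described seven-layer configuration is shown below; this is not the authors’ LSTM_Autoencoder class. The input sequence length (here SEQ_LEN = 5, matching the RepeatVector size), the return_sequences settings, and the single input feature are assumptions, since the paper reports only the layer sizes, the ReLU activation, the Adam optimizer, the MSE loss, and the training schedule.

```python
from tensorflow import keras
from tensorflow.keras import layers

SEQ_LEN, N_FEATURES = 5, 1   # assumed sequence length (matches RepeatVector) and feature count

model = keras.Sequential([
    layers.Input(shape=(SEQ_LEN, N_FEATURES)),
    # Encoder: three LSTM layers with 5, 16 and 1 unit(s).
    layers.LSTM(5, activation="relu", return_sequences=True),
    layers.LSTM(16, activation="relu", return_sequences=True),
    layers.LSTM(1, activation="relu", return_sequences=False),   # latent representation z
    layers.RepeatVector(SEQ_LEN),                                # repeat z for the decoder
    # Decoder: two LSTM layers with 5 and 16 units, then a per-step dense output.
    layers.LSTM(5, activation="relu", return_sequences=True),
    layers.LSTM(16, activation="relu", return_sequences=True),
    layers.TimeDistributed(layers.Dense(N_FEATURES)),
])
model.compile(optimizer="adam", loss="mse")
model.summary()
# Training an autoencoder uses the inputs as their own targets:
# model.fit(X_train, X_train, epochs=10, batch_size=32)
```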
To train the neural network, the autoencoder was provided with a centered time series consisting of Hurst exponent values H1, computed using the Whittle method with a sliding window technique. This time series was derived from fGn with a constant Hurst exponent value H1, indicating the absence of anomalies or abrupt changes in the fractal structure of the series. By utilizing a centered time series of Hurst exponent estimates, the approach avoided dependency on a specific value of H1, instead focusing solely on the change in the Hurst exponent
ΔH = abs(H1 − H2).
The objective of training the autoencoder was to enable the network to efficiently encode and decode sequences of Hurst exponent values characteristic of non-anomalous processes. The network aimed to develop a latent representation of the original time series while minimizing the reconstruction error for data where the Hurst exponent remained constant. This approach allowed the autoencoder to establish a pattern for normal time series behavior, which could then be utilized to detect deviations from this pattern, indicating the presence of anomalies.
For the formation of the training and test datasets, we generated 500 time series of fractional Gaussian noise (fGn) with varying Hurst exponents H for the training set. Each series contained 1,000,000 data points. From these series, a sliding window of size T = 400 was applied to extract sequences of Hurst exponents, each with a length of 50,000 data points, which were treated as normal cases for training the model.
The test dataset consisted of 100 time series, half of which contained anomalies. The anomaly point, defined as the time of change in the Hurst exponent, was selected randomly. Each test time series of fGn had a length of 30,000 data points. Correspondingly, a sliding window of the same size was used to extract sequences of Hurst exponents with a length of 1500 data points, providing both normal and anomalous cases for evaluating the model’s performance.
The primary task of the neural network was to identify anomalies, specifically by detecting abrupt changes in the Hurst exponent. In this context, the neural network was responsible for classifying time segments where significant alterations in the fractal structure of the series occurred, thereby signaling the presence of anomalies.
Initially, the method was applied to model time series generated based on fractional Brownian motion with various Hurst exponents. These synthetic time series included both data with a constant Hurst exponent and data with abrupt changes between values H1 and H2, allowing us to evaluate the method’s effectiveness in detecting known anomalies.
In the subsequent phase, after verifying the method on synthetic data, it was applied to real EEG recordings. In this context, it was hypothesized that the Hurst exponent values in the EEG data might vary due to different physiological or cognitive states. By applying the proposed method to these real EEG data, we aimed to identify potential fractal anomalies. This analysis enabled us to assess how well the method performs on time series where changes in the Hurst exponent were not explicitly predefined.

6. Results and Discussion

Let us examine the process of detecting anomalous changes in the Hurst exponent using a model series of fractional Gaussian noise. To train the neural network based on the LSTM autoencoder architecture, synthetic fGn series with various Hurst exponent values were employed.
Figure 2 displays 500 elements of an fGn series for H = 0.7. These data represent the characteristic behavior of a process with the specified Hurst exponent and illustrate the typical time series structure required for the subsequent training of the autoencoder.
The Hurst exponent was calculated over time series windows of size T = 400 with a step of p = 20 using the Whittle estimator. For an input fGn series of length N = 1,000,000, the resulting time series of Hurst exponent estimates Hx(t) has a length of Lw = 50,000 after resampling; the first 5000 values are shown in Figure 3. This series illustrates the typical estimation errors associated with the Hurst exponent. The series Hx(t) is then centered and used for subsequent neural network training.
The autoencoder model does not utilize a direct accuracy metric. Instead, an inverse metric based on the loss function is used to assess the model’s performance. The loss function calculates the average deviation between the input data and their reconstructed values, such as through the mean squared error. A low loss value indicates that the model effectively reconstructs the input data with minimal error, while a high loss value suggests greater reconstruction error.
The inverse metric, derived from the loss function, provides an indirect measure of model performance, particularly when standard metrics like accuracy are not applicable. The inverse metric can be calculated using the following formula:
Inverse Metric = 1/(1 + L),
where L is the loss value computed by the model’s loss function. The closer this value is to 1, the better the model is trained.
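Assuming the Keras setup sketched above, the inverse metric can be computed per epoch from the recorded loss values; the use of the Keras History object here is an illustrative assumption.

```python
import numpy as np

# X_train: centered sequences of Hurst estimates, shape (n_sequences, SEQ_LEN, 1).
history = model.fit(X_train, X_train, epochs=10, batch_size=32, verbose=0)
inverse_metric = 1.0 / (1.0 + np.array(history.history["loss"]))   # one value per epoch
print(inverse_metric[-1])   # closer to 1 means lower reconstruction loss
```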
Loss function and inverse metric (accuracy) plots are valuable tools for visually analyzing the training process. The loss function plot helps to evaluate the convergence of training, that is, how effectively the model learns with each epoch. The inverse metric plot, in turn, illustrates how close the model is to the optimal state, where the reconstruction error is minimized. Figure 4 presents the loss function plot (Figure 4a) and the accuracy plot (Figure 4b).
The trained neural network is now applied to detect anomalies in an fGn series X(t), in which the first 10,000 values have a Hurst exponent of H = 0.9 and the subsequent 20,000 values exhibit a decrease in the Hurst exponent to H = 0.7. This change in the fractal structure of the time series X(t) serves as a test case to evaluate the model’s ability to detect anomalies, specifically those reflected in a sharp shift in the Hurst exponent. The complete time series X(t) is shown in Figure 5a, with Figure 5b highlighting the transition point between the two segments with different H values.
For the synthetic time series with a varying Hurst exponent H, the series Hx (t) was constructed using a sliding window method with the same parameters used during model training. This series, which captures the dynamics of the Hurst exponent, was centered and then fed into the trained neural network.
During the model’s operation, new input data are compressed into a latent representation via the encoder and then reconstructed by the decoder. The autoencoder calculates the reconstruction error for these data, where a high reconstruction error typically indicates the presence of anomalous segments in the data.
Our LSTM autoencoder model for anomaly detection in time series data was carefully monitored to ensure it did not overfit. Training was conducted on model-generated data, allowing us to precisely verify the detection of known anomalies. Additionally, we utilized normalized data to standardize the input and reduce the risk of overfitting. By validating the model’s performance on known anomaly cases, we confirmed that the model maintained its generalization capability and effectively detected anomalies without overfitting.
Figure 6a shows the time series Hx (t) with the varying Hurst exponent. Anomalous values in the series Hx (t) can be identified by the plot of the mean squared error obtained after model training (Figure 6b). The red line on the graph marks the threshold, beyond which the presence of anomalous data is indicated.
Thus, the neural network enables the identification of intervals with varying Hurst exponent values. In Figure 7a, the time series of Hurst exponents Hx(t) is shown, with anomalous values of the exponent indicated by red dots. Figure 7b displays the fGn series with intervals corresponding to different Hurst exponent values. Due to significant errors in some of the Hurst estimates, not all points were classified as anomalous.
It is well-established that EEG recordings exhibit self-similarity properties, which reflect the complex dynamic processes occurring in the brain. The degree of self-similarity can vary significantly depending on an individual’s physiological, psychological, and emotional state. Emotional states such as stress, anxiety, or joy manifest as changes in EEG signal characteristics and influence variations in the Hurst exponent [34]. These variations enable the use of EEG not only for diagnosing and monitoring different states of consciousness but also for assessing an individual’s emotional state.
Consider a dataset containing EEG recordings [35]. This dataset includes recordings from 16 volunteers who watched short video clips and documented their emotions for each segment; it thus contains 32 files, each holding EEG recordings from multiple electrodes for one subject. Each record is 8000 samples long. As a result, our sample comprises EEG recordings representing several emotional states, each of which can be characterized by distinct self-similarity metrics.
Consider one of the recordings from the dataset (Frontopolar electrode): an EEG sequence containing 8000 values (see Figure 8a). The first 500 values are displayed in Figure 8b.
The centered series of Hurst exponents, computed using the Whittle method with a sliding window and fed into the trained neural network model, is illustrated in Figure 9a. The corresponding plot of the mean squared error is shown in Figure 9b.
In Figure 10a, the Hurst exponent time series Hx (t) is shown, with anomalous values of H marked by red dots. Figure 10b displays the EEG time series with intervals corresponding to different values of H and, consequently, varying emotional states.

7. Conclusions

The research presented demonstrates the efficacy of using neural networks for detecting anomalies in time series with fractal properties, specifically by analyzing changes in the Hurst exponent. By applying a trained autoencoder model, it was possible to identify intervals where the Hurst exponent deviated significantly, indicating potential anomalies in the underlying data. This approach is particularly valuable for detecting structural changes in complex systems, where traditional methods might fail to capture subtle or gradual shifts. The neural network model, trained on synthetic fractal time series data, successfully identified regions with significant changes in the Hurst exponent, which correspond to anomalies in the time series.
The model’s applicability was further demonstrated on EEG data, where it was able to detect changes in brain activity associated with different emotional states. This underscores the potential of the model for real-world applications, particularly in biomedical signal processing.
The study also emphasizes the critical role of accurate Hurst exponent estimation in fractal analysis. The use of Whittle’s method for estimating the Hurst exponent proved effective, allowing the model to accurately identify changes in fractal properties.
It should be emphasized that the primary objective of this study was not to conduct a comparative analysis of various neural network architectures in the context of anomaly detection in time series. Such a comparative evaluation requires a specific methodological approach and opens up broad prospects for further scientific inquiry in this field. The integration of machine learning techniques, particularly neural networks, into the process of anomaly detection offers significant advantages in terms of automation and scalability. This approach is well-suited for the real-time monitoring and analysis of large datasets, enabling timely detection and response to anomalies.
Future research in this area is expected to focus on enhancing model accuracy through the optimization of neural network structures, which will enable more effective detection of anomalies in time series data. An important aspect of this ongoing research is also the development and implementation of more precise methods for estimating the Hurst exponent. This will ensure more reliable diagnostics and monitoring of the fractal properties of time series.

Author Contributions

Conceptualization, L.K. and S.Y.; methodology, L.K. and D.C.; software, Y.K.; validation, Y.K. and D.C.; formal analysis, L.K. and S.Y.; investigation, S.Y. and D.C.; resources, Y.K. and D.C.; writing—original draft preparation, S.Y. and Y.K.; writing—review and editing, L.K. and D.C.; visualization, Y.K.; supervision, L.K. and S.Y.; project administration, S.Y. and D.C. All authors have read and agreed to the published version of the manuscript.

Funding

The study is funded by the IMPRESS-U program within the framework of the project Modeling and forecasting the spread of infection in war and post-war period using epidemiological, behavioral and genomic surveillance data (2023/05/Y/ST6/00263).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Generated data and test tasks are used.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Feder, J. Fractals; Springer: New York, NY, USA, 2013. [Google Scholar]
  2. Aguirre, J.; Viana, R.L.; Sanjuán, M.A.F. Fractal Structures in Nonlinear Dynamics. Rev. Mod. Phys. 2009, 81, 333–386. [Google Scholar] [CrossRef]
  3. Gowrisankar, A.; Banerjee, S. Frontiers of fractals for complex systems: Recent advances and future challenges. Eur. Phys. J. Spec. Top. 2021, 230, 3743–3745. [Google Scholar] [CrossRef] [PubMed]
  4. Mandelbrot, B.B.; Van Ness, J.W. Fractional Brownian Motions, Fractional Noises and Applications. SIAM Rev. 1968, 10, 422–437. [Google Scholar] [CrossRef]
  5. Kantelhardt, J.W. Fractal and Multifractal Time Series. In Mathematics of Complexity and Dynamical Systems; Meyers, R., Ed.; Springer: New York, NY, USA, 2012. [Google Scholar] [CrossRef]
  6. Brambila, F. (Ed.) Fractal Analysis—Applications in Physics, Engineering and Technology; InTech: London, UK, 2017. [Google Scholar] [CrossRef]
  7. Pilgrim, I.; Taylor, R.P. Fractal Analysis of Time-Series Data Sets: Methods and Challenges. In Fractal Analysis; InTech Open: London, UK, 2018. [Google Scholar] [CrossRef]
  8. Rao, P. Self-Similar Processes, Fractional Brownian Motion and Statistical Inference. Lect. Notes-Monogr. Ser. 2004, 45, 98–125. [Google Scholar] [CrossRef]
  9. Peters, E.E. Fractal Market Analysis: Applying Chaos Theory to Investment and Economics; Wiley: New York, NY, USA, 2009. [Google Scholar]
  10. Garcin, M. Forecasting with Fractional Brownian Motion: A Financial Perspective. Quant. Financ. 2022, 2, 1495–1512. [Google Scholar] [CrossRef]
  11. Weron, A. Mathematical Models for Dynamics of Molecular Processes in Living Biological Cells: A Single Particle Tracking Approach. Ann. Math. Silesianae 2018, 32, 5–41. [Google Scholar] [CrossRef]
  12. Lotte, F.; Bougrain, L.; Cichocki, A.; Clerc, M.; Congedo, M.; Rakotomamonjy, A.; Yger, F. A Review of Classification Algorithms for EEG-Based Brain–Computer Interfaces: A 10 Year Update. J. Neural Eng. 2018, 15, 031005. [Google Scholar] [CrossRef] [PubMed]
  13. Craik, A.; He, Y.; Contreras-Vidal, J.L. Deep Learning for Electroencephalogram (EEG) Classification Tasks: A Review. J. Neural Eng. 2019, 16, 031001. [Google Scholar] [CrossRef] [PubMed]
  14. Zhang, H.Y.; Feng, Z.Q.; Feng, S.Y.; Zhou, Y. A Survey of Methods for Estimating Hurst Exponent of Time Sequence. Available online: https://arxiv.org/pdf/2310.19051.pdf (accessed on 30 August 2024).
  15. Shang, H.L. A Comparison of Hurst Exponent Estimators in Long-Range Dependent Curve Time Series. J. Time Ser. Econom. 2020, 12, 20190009. [Google Scholar] [CrossRef]
  16. Hamza, A.H.; Hmood, M.Y. Comparison of Hurst Exponent Estimation Methods. J. Econ. Adm. Sci. 2021, 27, 167–183. [Google Scholar] [CrossRef]
  17. Ivanisenko, I.; Kirichenko, L.; Radivilova, T. Investigation of Self-Similar Properties of Additive Data Traffic. In Proceedings of the CSIT 2015 Xth International Scientific and Technical Conference Computer Science and Information Technologies, Lviv, Ukraine, 14–17 September 2015; pp. 169–172. [Google Scholar] [CrossRef]
  18. Ivanisenko, I.; Kirichenko, L.; Radivilova, T. Investigation of Multifractal Properties of Additive Data Stream. In Proceedings of the 2016 IEEE 1st International Conference on Data Stream Mining and Processing, DSMP 2016, Lviv, Ukraine, 23–27 August 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 305–308. [Google Scholar] [CrossRef]
  19. Kirichenko, L.; Lavrynenko, R. Probabilistic Machine Learning Methods for Fractional Brownian Motion Time Series Forecasting. Fractal Fract. 2023, 7, 517. [Google Scholar] [CrossRef]
  20. Kowalek, P.; Loch-Olszewska, H.; Szwabiński, J. Classification of Diffusion Modes in Single-Particle Tracking Data: Feature-Based versus Deep-Learning Approach. Phys. Rev. E 2019, 100, 032410. [Google Scholar] [CrossRef] [PubMed]
  21. Li, X.; Yu, J.; Xu, L.; Zhang, G. Time Series Classification with Deep Neural Networks Based on Hurst Exponent Analysis. In Neural Information Processing; Springer International Publishing: Cham, Switzerland, 2017; pp. 194–204. [Google Scholar] [CrossRef]
  22. Varun, C.; Arindam, B. Anomaly Detection: A Survey. ACM Comput. Surv. 2019, 52, 1–72. [Google Scholar] [CrossRef]
  23. Andrzejak, R.G.; Lehnertz, K.; Mormann, F.; Rieke, C.; David, P.; Elger, C.E. Indications of Nonlinear Deterministic and Finite-Dimensional Structures in Time Series of Brain Electrical Activity: Dependence on Recording Region and Brain State. Phys. Rev. E 2001, 64, 061907. [Google Scholar] [CrossRef] [PubMed]
  24. Radivilova, T.; Kirichenko, L.; Ageyev, D.; Tawalbeh, M.; Bulakh, V.; Zinchenko, P. Intrusion Detection Based on Machine Learning Using Fractal Properties of Traffic Realizations. In Proceedings of the 2019 IEEE International Conference on Advanced Trends in Information Theory (ATIT), Kyiv, Ukraine, 18–20 December 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 218–221. [Google Scholar] [CrossRef]
  25. Radivilova, T.; Kirichenko, L.; Lemeshko, O.; Ageyev, D.; Mulesa, O.; Ilkov, A. Analysis of Anomaly Detection and Identification Methods in 5G Traffic. In Proceedings of the 2021 11th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), Kraków, Poland, 22–25 September 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1108–1113. [Google Scholar] [CrossRef]
  26. Khlamov, S.; Savanevych, V. Big Astronomical Datasets and Discovery of New Celestial Bodies in the Solar System in Automated Mode by the CoLiTec Software. In Knowledge Discovery in Big Data from Astronomy and Earth Observation, 1st ed.; Part IV, Chapter 18; Elsevier: Amsterdam, The Netherlands, 2020; pp. 331–345. [Google Scholar] [CrossRef]
  27. Radivilova, T.; Kirichenko, L.; Alghawli, A.S.; Ilkov, A.; Tawalbeh, M.; Zinchenko, P. The Complex Method of Intrusion Detection Based on Anomaly Detection and Misuse Detection. In Proceedings of the 2020 IEEE 11th International Conference on Dependable Systems, Services and Technologies (DESSERT), Kyiv, Ukraine, 14–18 May 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 133–137. [Google Scholar] [CrossRef]
  28. Banna, O.; Mishura, Y.; Ralchenko, K.; Shklyar, S. Fractional Brownian Motion; John Wiley & Sons: Hoboken, NJ, USA, 2019. [Google Scholar] [CrossRef]
  29. Abd-Alhamid, F.; Kent, M.; Calautit, J.; Wu, Y. Evaluating the Impact of Viewing Location on View Perception Using a Virtual Environment. Build. Environ. 2020, 180, 106932. [Google Scholar] [CrossRef]
  30. A Gentle Introduction to LSTM Autoencoders. Available online: https://machinelearningmastery.com/lstm-autoencoders/ (accessed on 30 August 2024).
  31. Qais, M.H.; Kewat, S.; Loo, K.H.; Lai, C.-M.; Leung, A. LSTM-Based Stacked Autoencoders for Early Anomaly Detection in Induction Heating Systems. Mathematics 2023, 11, 3319. [Google Scholar] [CrossRef]
  32. Lachekhab, F.; Benzaoui, M.; Tadjer, S.A.; Bensmaine, A.; Hamma, H. LSTM-Autoencoder Deep Learning Model for Anomaly Detection in Electric Motor. Energies 2024, 17, 2340. [Google Scholar] [CrossRef]
  33. Rao, A.R.; Wang, H.; Gupta, C. Functional approach for Two Way Dimension Reduction in Time Series. In Proceedings of the 2022 IEEE International Conference on Big Data (Big Data), Osaka, Japan, 17–20 December 2022; pp. 1099–1106. [Google Scholar] [CrossRef]
  34. Sourina, O.; Liu, Y. A fractal-based algorithm of emotion recognition from EEG using arousal-valence model. In Proceedings of the International Conference on Bio-Inspired Systems and Signal Processing (BIOSIGNALS-2011), Rome, Italy, 26–29 January 2011; pp. 209–214. [Google Scholar]
  35. EEG Dataset EEG-Emotion-Classification. Available online: https://www.kaggle.com/datasets/samnikolas/eeg-dataset (accessed on 19 September 2024).
Figure 1. The neural network architecture used in the experiment.
Figure 2. Time series fGn for H = 0.7.
Figure 3. Time series of Hurst exponent estimates Hx(t).
Figure 4. Plot of the loss function (a) and model accuracy (b).
Figure 5. Time series with changing Hurst exponent: full 30,000 values (a); interval where the change in H occurs (b).
Figure 6. Time series Hx (t) (a) and mean squared error (b).
Figure 7. Detected anomalies in the Hurst exponent series (a) and the corresponding intervals of fractal Gaussian noise (b).
Figure 8. EEG with varied emotional states: full recording (a); 500 elements (b).
Figure 9. Time series Hx (t) for EEG (a) and mean squared error (b).
Figure 10. Detected anomalies in Hx (t) (a) and corresponding intervals in the EEG (b).


