1. Introduction
Epileptic seizures, characterized by abnormal neuronal activity in the human brain [1], are a common disorder of the brain's nervous system and seriously threaten the normal life and even the safety of millions of patients around the world [2]. Epileptic patients may suffer from uncontrolled jerking movements, loss of consciousness, and other painful symptoms, which may result in unexpected death if patients are not treated in time [3]. Electroencephalogram (EEG) signals have proven to be the gold-standard tool for detecting and analyzing epilepsy [4]. With the development of wearable EEG recording systems [5,6], computer-aided techniques, and Internet of Medical Things (IoMT) devices, continuous monitoring of the EEG signals of epileptic patients is a promising way to provide real-time epilepsy monitoring and avoid catastrophic accidents. The common method for extracting physiological and disease information from EEG waveforms is for experienced neurologists or doctors to manually check long-term recordings and analyze the wave morphology, which is time-consuming, labor-intensive, experience-dependent, and inefficient for real-time and long-term epilepsy detection. Recognizing EEG signals automatically and efficiently is therefore the bottleneck for real-time automatic epilepsy detection.
Recently, many machine learning (ML) methods, including support vector machine (SVM) [7], decision tree [8], naïve Bayes, and KNN [9], have been explored for automatically detecting epilepsy with features extracted from EEG signals in the time domain [10], frequency domain [11], and time–frequency domain [12]. He et al. [7] compared the performance of SVM and gradient boosting decision tree (GBDT) on automatic EEG signal recognition using empirical mode decomposition (EMD)-based time domain features and a nonlinear power spectral density (PSD)-based feature. The results demonstrated that the GBDT achieved better performance. Albaqami et al. [8] utilized wavelet packet decomposition (WPD) to decompose the EEG signals and extracted statistical features from the selected coefficients. In that research, a feature aggregation algorithm was employed to reduce the dimension, and the extracted features were fed into a GBDT classifier. The results showed that the proposed method based on WPD and the GBDT classifier exhibited higher accuracy and sensitivity than existing techniques. Wang et al. [9] combined the weighted KNN classifier and the Bray-Curtis distance to realize automatic epilepsy detection, and the results suggested that the proposed method improved the prediction accuracy and reduced the false alarm rate. Xu et al. [13] investigated the GBDT classifier and nonlinear features including entropy, sample entropy, permutation entropy, spectral entropy, and wavelet entropy for epilepsy prediction, and the results showed that the proposed method could precisely recognize the two categories of EEG signal.
Automatic epilepsy detection based on ML algorithms can thus be divided into two procedures, namely feature extraction and classifier design. Many studies have been dedicated to exploring sophisticated classification algorithms and complicated feature extraction methods to improve prediction accuracy, while few of them have taken the computational cost into consideration. This limits the use of those methods for real-time epilepsy detection on wearable devices with limited computing power [14].
The features of EEG data can be extracted in the time domain, frequency domain, and time–frequency domain. Specifically, features extracted from raw data in the time domain (e.g., amplitude, entropy, etc.) have the advantages of easy implementation and high computing efficiency, although traditional time domain methods may lose significant epileptic details contained in the EEG signals. The characteristics of EEG signals can also be described in the frequency domain using signal transformation, and the most widely used frequency domain feature extraction approach is PSD. The PSDs of different kinds of EEG signals (healthy, interictal, and ictal signals) from the EEG dataset of the University of Bonn were shown to differ significantly from each other [15]. Obtaining features from the time domain or the frequency domain alone is inadequate for fully mining the information. Many studies therefore transform the one-dimensional raw EEG data into two-dimensional time–frequency images to extract the hidden features of the EEG signals for ML-based signal recognition. Ozdemir et al. [16] applied a Fourier-based synchrosqueezing transform (SST) with high resolution in the time–frequency domain to convert the raw data into time–frequency images, and a convolutional neural network (CNN) was utilized for signal classification. The proposed SST-based CNN method predicted epileptic seizures with satisfactory accuracy. Although the features extracted from the frequency and time–frequency domains can provide abundant knowledge about epilepsy, those methods require a transformation process. Moreover, the convolution processing of time–frequency images for feature extraction and further image classification requires high computing power, especially GPU devices. Those factors make feature extraction methods in the frequency and time–frequency domains unsuitable for real-time epilepsy detection. Therefore, the question of how to extract significant information from EEG signals in the time domain with high efficiency remains open.
Many successful applications of ML methods based on EEG signal recognition have demonstrated their great potential and superiority in automatic epilepsy detection. However, model interpretability, which is critical for training convergence and generalization of the model [17], is often overlooked in existing publications, which hinders wider application of ML methods in epilepsy detection. Enhancing model interpretability is important both for improving users' understanding of ML-based epilepsy detection and for improving its predictive performance.
In this research, a novel EEG-driven epilepsy detection method is proposed, based on feature extraction using a deep autoencoder (AE) without time–frequency transformation. AE-based features are employed as inputs to three typical ML classifiers to validate the effectiveness and superiority of the proposed AE-based time domain feature extraction method in epilepsy detection. Principal component analysis (PCA)-based features are fed into the same classifiers for comparison. Both feature distribution analysis and model interpretability analysis (permutation importance analysis and the SHapley Additive exPlanations (SHAP) method) are conducted to understand the underlying mechanism and to explain the superiority of the AE-based feature extraction method. This study aims to advance ML-based methods for real-time epilepsy detection using EEG signals.
2. Methodology
In the proposed method, a certain type of EEG signal is used for encoding and decoding in the training of the AE model. When other types of signals are then used as test inputs, the output reconstruction errors exhibit a corresponding discrepancy, since the trained AE model has learned the characteristics of one specific type of EEG signal. On this basis, sensitive features can be extracted by quantifying the reconstruction error, and these features are then used for downstream classifier training.
The proposed AE is first utilized to reconstruct the EEG signals in the time domain. The indicators that quantify the signal reconstruction error are employed as the epilepsy-sensitive features, and the distribution patterns of the obtained features are presented to preliminarily investigate the effectiveness of the AE-based features. The obtained features are fed into three typical classifiers to train predictors for epilepsy detection. Finally, model interpretability analysis is conducted to determine the effects of the different parameters on the model and to explain the superiority of the proposed method. The overall flowchart of this research is shown in Figure 1.
2.1. Dataset
The EEG dataset from the University of Bonn [15], a publicly available dataset, was utilized to verify the feasibility and effectiveness of the AE-based feature extraction method for automatic epilepsy detection. The dataset consists of five subsets, and each subset contains 100 single-channel EEG signals. The duration of each signal is 23.6 s and the sampling frequency is 173.6 Hz. Signals in subsets A and B were recorded from five healthy volunteers with eyes open and eyes closed, respectively. In subsets C and D, the EEG data were measured from five epileptic patients during interictal intervals. The data in subset E were obtained from the five patients when active seizures occurred. In this research, the signals from subsets A, D, and E were selected to represent the healthy, interictal, and ictal EEG signals, respectively. To improve the generality of downstream models, the raw EEG signals were first normalized into the range 0 to 1. Typical normalized signal samples of the three subsets are shown in Figure 2.
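The normalization step can be illustrated with a minimal sketch. It assumes per-signal min-max scaling, which is one common way to map a signal into the range 0 to 1; the text does not state which scheme was used, and the function name `min_max_normalize` and the sample amplitudes are hypothetical. The sketch uses only the Python standard library to stay self-contained.

```python
def min_max_normalize(signal):
    """Rescale a 1-D signal linearly so that its values lie in [0, 1].

    ASSUMPTION: scaling is done per signal using its own min and max.
    """
    lo, hi = min(signal), max(signal)
    if hi == lo:
        # A constant signal has no range; map it to zeros by convention.
        return [0.0 for _ in signal]
    return [(v - lo) / (hi - lo) for v in signal]


# Samples per recording implied by the stated duration and sampling rate.
n_samples = round(23.6 * 173.6)

# Made-up amplitudes standing in for one raw EEG segment.
raw = [-120.0, 35.5, 0.0, 410.0, -75.25]
normalized = min_max_normalize(raw)
```

With a 23.6 s recording sampled at 173.6 Hz, each signal holds about 4097 samples, so the same function would be applied to one list of that length per recording.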
2.2. AE
An AE, consisting of an encoder and a decoder, is a kind of unsupervised neural network for distilling the intrinsic characteristics hidden in the raw data. Specifically, the encoder encodes the time series signal $x$ as a latent space vector $h$, while the decoder reconstructs the data from the latent space vector. The process can be expressed as:
$$h = \varphi_1\left(W x + b_1\right)$$
$$\hat{x} = \varphi_2\left(W' h + b_2\right)$$
where $\varphi_1$ and $\varphi_2$ are the activation functions of the encoder and decoder, $W$ and $W'$ are the weight matrices, $\hat{x}$ is the reconstructed signal from the AE, and $b_1$ and $b_2$ are the bias terms of the encoder and decoder, respectively.
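The encoder and decoder mappings can be made concrete with a small forward-pass sketch. It is a simplified dense, single-layer AE with randomly initialized (untrained) weights, not the convolutional architecture used in this study. The ReLU encoder activation, the identity decoder activation, the layer sizes, and all function names are assumptions chosen for illustration.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def relu(vector):
    return [max(0.0, v) for v in vector]


def encode(x, w_enc, b_enc):
    # Latent vector: ReLU applied to (encoder weights @ x + encoder bias).
    return relu([a + b for a, b in zip(matvec(w_enc, x), b_enc)])


def decode(h, w_dec, b_dec):
    # Reconstruction: decoder weights @ h + decoder bias (identity activation).
    return [a + b for a, b in zip(matvec(w_dec, h), b_dec)]


rng = random.Random(0)       # fixed seed so the sketch is repeatable
n_in, n_latent = 8, 3        # hypothetical sizes: 8 samples -> 3 latent units
w_enc = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_latent)]
w_dec = [[rng.uniform(-0.5, 0.5) for _ in range(n_latent)] for _ in range(n_in)]
b_enc = [0.0] * n_latent
b_dec = [0.0] * n_in

x = [rng.random() for _ in range(n_in)]   # stand-in for a normalized EEG segment
h = encode(x, w_enc, b_enc)               # compressed representation
x_hat = decode(h, w_dec, b_dec)           # reconstruction with the input's length
```

Because the weights here are untrained, `x_hat` is not a good reconstruction of `x`; training adjusts the weights and biases so that the two become close for the signal type seen during training.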
The AE optimizes the weight matrices and biases through backpropagation to minimize the reconstruction error. Mean squared error (MSE) is the most common loss function and can be expressed as:
$$L_{\mathrm{MSE}} = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \hat{x}_i\right)^2$$
where $N$ is the dimension of the input, and $x_i$ and $\hat{x}_i$ are the $i$th elements of the input and of the reconstructed output, respectively.
As shown in Figure 3, the AE in this study consists of a series of stacked hidden layers for reconstructing the EEG signals. Specifically, layers E1 to E3 form the encoding module, which compresses the input data into the latent space vector. Layers D1 to D3 form the decoding module, which reconstructs the input data from the latent space vector. Each hidden layer is made up of a convolutional layer and an activation layer (ReLU). The loss function is defined as the MSE, and the Adam optimizer with a learning rate of 0.001 is employed to update the network weights based on the training data.
In this study, the AE is trained using EEG signals from the interictal condition. Once trained, it can calculate the epilepsy-sensitive features directly in the time domain whenever monitored EEG data are fed into it, which enhances the efficiency of feature extraction. Moreover, the AE has demonstrated a remarkable ability to preserve the essence of the input EEG signals while eliminating noise, which may allow it to retain more information than a typical dimension reduction method (e.g., PCA). From the perspective of computational efficiency and information preservation, the AE is therefore explored to extract features from EEG signals in the time domain for the automatic and real-time detection of epilepsy.
Three features, namely MSE, original-to-reconstructed signal ratio (ORSR), and cosine similarity (CS), are defined to quantify the signal reconstruction and are further used as epilepsy-sensitive features. MSE is a standard indicator for quantifying the difference between two signals and can be expressed as:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \hat{x}_i\right)^2$$
where $n$ is the length of the original data, and $x_i$ and $\hat{x}_i$ are the $i$th samples of the original and reconstructed signals. A higher MSE value indicates a larger reconstruction error, while a lower MSE value indicates a smaller one. MSE has demonstrated its effectiveness in measuring signal differences in many fields [18] and is employed in this study.
Inspired by the signal-to-noise ratio, ORSR is defined as the ratio between the amplitude of the original and reconstructed signals, and could be expressed as:
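CS quantifies signal similarity by treating the signals as vectors and calculating the cosine of the angle between them. Higher cosine values indicate higher similarity and vice versa. The cosine value of two vectors can be obtained as:
$$\cos\theta = \frac{x \cdot \hat{x}}{\lVert x \rVert \, \lVert \hat{x} \rVert}$$
where $x$ and $\hat{x}$ are the original and reconstructed signals. The discrete form of the equation can be expressed as:
$$\mathrm{CS} = \frac{\sum_{i=1}^{n} x_i \hat{x}_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} \hat{x}_i^2}}$$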
The three indicators, MSE, ORSR, and CS, are combined into the input feature vector that represents the deep features of the EEG data for data-driven automatic epilepsy detection.
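The three reconstruction indicators can be sketched as plain functions of an original signal and its reconstruction. MSE and CS follow their standard definitions. The exact ORSR formula is not recoverable from the text here, so the sketch assumes it is the ratio of root-mean-square amplitudes of the original and reconstructed signals; that form, and all function names, are assumptions for illustration.

```python
import math


def mse(x, x_hat):
    """Mean squared difference between a signal and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def rms(signal):
    """Root-mean-square amplitude of a signal."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def orsr(x, x_hat):
    # ASSUMPTION: ORSR taken as RMS(original) / RMS(reconstructed).
    # Undefined when the reconstruction is all zeros.
    return rms(x) / rms(x_hat)


def cosine_similarity(x, x_hat):
    """Cosine of the angle between the two signals viewed as vectors."""
    dot = sum(a * b for a, b in zip(x, x_hat))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_x_hat = math.sqrt(sum(b * b for b in x_hat))
    return dot / (norm_x * norm_x_hat)


def reconstruction_features(x, x_hat):
    """Feature vector [MSE, ORSR, CS] for one signal."""
    return [mse(x, x_hat), orsr(x, x_hat), cosine_similarity(x, x_hat)]


x = [0.2, 0.4, 0.6, 0.8]            # made-up normalized signal
close = [0.2, 0.4, 0.6, 0.8]        # a perfect reconstruction
poor = [0.8, 0.6, 0.4, 0.2]         # a badly mismatched reconstruction

features_close = reconstruction_features(x, close)
features_poor = reconstruction_features(x, poor)
```

In this toy case the poor reconstruction yields a larger MSE and a smaller CS than the perfect one, while the assumed ORSR stays near 1 because both reconstructions have the same overall amplitude. This is one reason several indicators are used together rather than a single one.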
2.3. Supervised Machine Learning Classifiers for Epilepsy Detection
With the rapid development of computational power, many studies have been conducted to achieve the goal of automatic epilepsy detection based on ML algorithms. To demonstrate the effectiveness and superiority of the AE-based features in EEG signal recognition, three widely used ML classifiers including the random forest (RF) classifier, the AdaBoost classifier, and the Gradient Boosting classifier are employed in this research. The working principles of the three classifiers are summarized as follows.
2.3.1. RF Classifier
The RF classifier [19] is an ensemble algorithm based on decision tree theory and consists of multiple decision tree classifiers. Each individual decision tree generates a prediction, and the final prediction is obtained by majority voting, as shown in Figure 4. The RF classifier has been widely utilized to classify EEG signals for evaluating sleep stages and diagnosing sleep problems [20] and for identifying landscape perception [21], owing to its good balance between execution time and reliability. Therefore, RF is employed to classify the EEG signals in order to evaluate the performance of the features extracted by the AE.
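The voting step that turns individual tree predictions into a final label can be sketched in a few lines. The class labels and the function name `majority_vote` are hypothetical, and the per-tree predictions are made up rather than produced by trained trees.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the label predicted by the largest number of trees.

    On a tie, the label that appears first in the input wins, which is
    the ordering behavior of Counter.most_common.
    """
    return Counter(tree_predictions).most_common(1)[0][0]


# Made-up outputs of five decision trees for one EEG signal.
votes = ["ictal", "interictal", "ictal", "healthy", "ictal"]
label = majority_vote(votes)
```

Here three of the five trees vote for the same class, so that class becomes the forest's prediction even though two trees disagree.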
2.3.2. AdaBoost Classifier
AdaBoost, short for adaptive boosting, is adaptive in that each round focuses on the samples that were poorly predicted in previous rounds. AdaBoost can be regarded as a training framework that can employ any classification algorithm as its base learner and improves performance by combining the learners. The basic principle and procedures are shown in Figure 5.
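How sample weights shift after one boosting round can be sketched with the classical binary AdaBoost update, using labels of -1 and +1. This is an illustration only: the study classifies three signal types, which requires a multi-class variant, and the labels, predictions, and function name below are made up.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """One round of classical binary AdaBoost with labels in {-1, +1}.

    Returns the weak learner's vote weight (alpha) and the renormalized
    sample weights. Assumes the weights sum to 1 and that the weighted
    error lies strictly between 0 and 1.
    """
    # Weighted error: total weight of the misclassified samples.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    alpha = 0.5 * math.log((1.0 - err) / err)
    # t * p is +1 for a correct prediction and -1 for a wrong one, so
    # correct samples are scaled down and wrong samples are scaled up.
    updated = [w * math.exp(-alpha * t * p)
               for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


y_true = [1, 1, -1, -1, 1]
y_pred = [1, -1, -1, -1, 1]          # the second sample is misclassified
weights = [0.2] * 5                  # uniform initial weights
alpha, new_weights = adaboost_round(weights, y_true, y_pred)
```

After this round the misclassified sample's weight rises from 0.2 to about 0.5, while each correctly classified sample drops to about 0.125, so the next weak learner is pushed toward the sample that was previously misclassified.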
2.3.3. Gradient Boosting Classifier
The Gradient Boosting classifier is a prevalent classification algorithm based on decision trees. This classifier has been widely utilized in EEG signal recognition [7,22] because of its ability to reduce over-fitting and its high prediction accuracy. The workflow of the Gradient Boosting classifier is shown in Figure 6.
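The core idea of gradient boosting, fitting each new tree to the errors left by the current ensemble, can be sketched with one-split trees (stumps) on a single feature. This is a simplified illustration that uses squared-error loss on 0/1 targets, whereas gradient boosting classifiers typically use a log-loss objective. The data, the hyperparameters, and the function names are assumptions.

```python
def fit_stump(xs, residuals):
    """Fit the best single-threshold split of a 1-D feature (squared error).

    Requires at least two distinct feature values.
    """
    best = None
    for thr in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, thr, left_mean, right_mean)
    _, thr, left_mean, right_mean = best
    return lambda x: left_mean if x <= thr else right_mean


def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Additive model: the mean of ys plus a scaled sum of stumps."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


# Made-up 1-D feature values with 0/1 targets.
xs = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 1, 1, 1]
model = gradient_boost(xs, ys)
score_low, score_high = model(0.15), model(0.85)
```

Thresholding the model output at 0.5 turns the scores into class labels. Each round shrinks the remaining residuals, and the learning rate controls how much of each stump's correction is applied.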
The three typical ML classifiers were utilized to investigate the performance of the AE-based features in EEG signal recognition and automatic epilepsy detection.
5. Discussions
Efficient feature extraction from EEG signals is the critical step in automatic epilepsy detection using ML-based methods. Most studies have focused on improving prediction accuracy using sophisticated signal processing methods, and few have focused on obtaining features with high computational efficiency and a high degree of information preservation. In this research, a deep AE was proposed to extract features in the time domain, which avoids the transformation process and ensures computing efficiency. Meanwhile, AE-based features preserved more significant EEG signal information than the principal components (PCs) obtained from PCA. The obtained features were then utilized to train the classification models and predict epilepsy. Feature distribution analysis and model interpretability analysis were then performed to improve users' understanding of the trained classification models.
A limitation of this study is that the analysis was mainly conducted on the EEG dataset from the University of Bonn, so the generalization of the AE-based feature extraction method for automatic epilepsy detection needs to be studied further. In addition, only three features (CS, ORSR, and MSE) were extracted in the time domain based on the proposed AE, and more prevalent time domain features of EEG signals (e.g., amplitude, zero-crossing rate, etc.) need to be investigated for epilepsy detection.