1. Introduction
Epilepsy is a neurological condition that affects over 50 million individuals globally [1]. It is characterized by abnormal increases in the brain's electrical activity, leading to symptoms such as loss of attention, hallucinations, and convulsions [2]. The consequences of epileptic seizures can be severe, impacting physical health, mental well-being, and social relationships; they include loss of consciousness, potential injuries, and, in extreme cases, the risk of sudden death. The acquisition of epileptic episodes through electroencephalography (EEG) provides real-time, cost-effective, and non-invasive information with exceptional spatio-temporal resolution [3]. Detecting seizures in EEG signals plays a critical role in diagnosing and treating epilepsy. Depending on the extent of brain area involvement during a seizure, epilepsy can be classified into two main types: focal seizures, which affect specific brain areas, and generalized seizures, which involve both sides of the brain.
Neuronal signals originating from brain activity are captured by electrodes positioned on the scalp's surface [4]. These electrodes are placed according to the configuration proposed in the international 10–20 system of electrode placement [5], as depicted in Figure 1. Electrodes are identified by their locations: frontal polar (Fp), frontal (F), central (C), parietal (P), temporal (T), occipital (O), and auricular (A), with even numbers on the right hemisphere, odd numbers on the left hemisphere, and "z" denoting the midline. Additionally, the nasion (bridge of the nose), inion (nape), and auricular points serve as reference landmarks.
EEG signals are complex: each channel associated with an electrode measures the sum of electrical impulses from the cerebral cortex, originating from the activity of billions of neurons proximal to the electrode [6]. EEG signals provide temporal, spatial, and spectral information. Their frequency range typically varies between 0.5 and 100 Hz, delineating five principal bands associated with brain and body states, identified by the Greek letters delta (0.5–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), beta (13–30 Hz), and gamma (30–100 Hz) [7]. During an epileptic seizure, hypersynchronization of neuronal signals leads to a significant increase in signal amplitude across the bandwidth. Seizure recordings are described in three phases: preictal, ictal, and interictal, corresponding to the period preceding the seizure, the seizure itself, and the period between two attacks, respectively [8].
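For illustration, each of the five bands can be isolated with a zero-phase bandpass filter; this sketch is not part of the study's pipeline (which uses wavelet decomposition instead) and assumes the 256 Hz sampling rate of the datasets analyzed later:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 256  # sampling rate (Hz), matching the datasets used in this study

# Canonical EEG bands (Hz) as defined in the text
BANDS = {
    "delta": (0.5, 4),
    "theta": (4, 8),
    "alpha": (8, 13),
    "beta": (13, 30),
    "gamma": (30, 100),
}

def extract_band(signal, band, fs=FS, order=4):
    """Zero-phase Butterworth bandpass isolating one EEG band."""
    low, high = BANDS[band]
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, signal)

# Example: a synthetic 10 Hz (alpha) tone survives the alpha filter,
# while a 40 Hz (gamma) component is rejected
t = np.arange(0, 4, 1 / FS)
x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
alpha = extract_band(x, "alpha")
```

The same call with `"beta"` would suppress both components, since neither 10 Hz nor 40 Hz falls inside the 13–30 Hz passband.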
Figure 2 shows an EEG signal with an epileptic seizure occurring in the interval 2996–3036 s, as annotated by neurophysiologists. The beginning and end of the seizure period were delineated, along with the respective preictal and postictal periods.
Classifying EEG signals presents a significant challenge due to the dynamic nature of the signals and the diverse seizure patterns observed across patients and recording sessions. Additionally, EEG data acquisition systems are susceptible to various forms of interference, such as muscle movements, blinking, and ambient background noise. While visual analysis of EEG signals can identify epileptic crises, this manual process is slow, costly, and prone to human error. The increasing demand for more accurate and efficient diagnoses has prompted researchers to develop algorithms for the automatic detection of these crises.
This study introduces a model for detecting epileptic attacks in EEGs based on the classification of alpha and beta brainwaves during intervals with and without crises. In this study, a Savitzky–Golay filter is employed to increase the signal-to-noise ratio (SNR) and alpha and beta waves are derived through the application of discrete wavelet transform (DWT). Features from every EEG signal epoch are then extracted through statistical analysis and they are classified as seizure or non-seizure using a linear support vector machine (SVM). Algorithm parameters are adjusted to maximize performance and precision in detecting epileptic crises.
The key contributions of this work are as follows: (1) the implementation of noise reduction techniques on the signals, which is crucial for training precise models in EEG analysis; (2) signal decomposition, significantly reducing data dimensionality and thus optimizing the computational cost of processing; and (3) a detailed exploration and comparison of alpha- and beta-band classification, which not only enables detection in individual channels but also extends the analysis to single frequency bands, enhancing the versatility of the approach.
The article is organized as follows: Section 2 provides a thorough review of related works, Section 3 introduces the proposed approach, Section 4 presents the databases, results, performance, and discussion, and Section 5 concludes by presenting findings and outlining future research directions.
2. Related Works
Epileptic seizure detection in EEG signals has seen significant progress in recent years, characterized by the introduction of novel techniques and approaches rooted in machine learning (ML). A review of the literature reveals a diverse range of methodologies for studying epileptic seizures in EEG signals, all united by the shared goal of enhancing diagnostic accuracy while streamlining analysis resources and time. The following summarizes the latest advancements in this field, encompassing both recent work and pertinent classical research.
Deep learning (DL) models have become widely used in EEG signal analysis due to their high accuracy. Convolutional neural networks (CNNs), in particular, are favored for their pooling layers that efficiently reduce data dimensionality and automate feature selection. This makes them a preferred choice among researchers. Despite their advantages, DL models are complex. They require substantial computational resources, involve numerous nonlinear functions, and need large datasets for effective generalization. Training these models can be time-consuming and involves careful tuning of parameters to avoid overfitting. Additionally, the inner workings of hidden layers can be opaque, posing challenges for implementation in simpler applications.
To mitigate some of these challenges, the authors of [9] used a one-dimensional CNN (1D-CNN) with a single convolutional layer, achieving an impressive accuracy of 99.4%. Similarly, [10] utilized time- and frequency-domain signals in a three-layer CNN, reaching 97.5% accuracy. The authors of [11] achieved 92.37% accuracy using 2D CNNs for single electrodes and 3D CNNs for images derived from multiple electrodes.
Recurrent neural networks (RNNs) have also proven effective in capturing the temporal dynamics of EEG signals. Long short-term memory (LSTM) networks and their variants, such as bidirectional LSTM (BiLSTM), gated recurrent unit (GRU), and fully convolutional nested LSTM (FC-NLSTM), have shown remarkable accuracies of 99.3%, 100%, and 100%, respectively, as reported in [12,13,14]. The authors agree that these models, despite their high accuracy, require substantial computational resources.
Transfer learning, exemplified by [15], which combines networks such as ResNet and Inception, achieves 100% accuracy. However, optimizing these large models for computational efficiency remains challenging due to their complex hidden layers. Another high-performing model is proposed in [16], utilizing a recurrent topic-synchronized variational autoencoder (RTS-RCVAE) with an accuracy of 98.43%, albeit at significant computational expense.
Several studies have integrated DL techniques with signal processing, incorporating manual feature extraction. Many of these studies utilize the wavelet transform (WT) for signal decomposition, with the aim of reducing signal dimensionality by extracting signal representations across different frequency bands, thereby reducing the volume of data processed by DL models. Models combining WT with DL have consistently attained high accuracy. For instance, [17] utilized multiresolution WT in conjunction with an artificial neural network (ANN), achieving an accuracy of 99.6%. The authors of [18] explored the continuous wavelet transform (CWT) for statistical feature extraction and classification using an LSTM, achieving 100% accuracy. Another model, proposed in [19], employed wavelet packet decomposition (WPD) and the fast Fourier transform (FFT), reaching an accuracy of 98.33% through CNN training. The approach of combining the discrete wavelet transform (DWT) with LSTM training, proposed in [20], obtained 96.1% accuracy in classifying statistical features. In [21], the tunable-Q wavelet transform (TQWT) was applied to the signals to compute multiple linear and nonlinear features, including statistical, frequency, and nonlinear measures such as fractal dimensions (FDs) and entropy; a classification method combining ML and DL, with a CNN-RNN-based DL model, achieved an accuracy of 99.71%. Regarding the implementation costs associated with WT algorithms, all cases demonstrate efforts to reduce the volume of data fed into complex DL models, either through signal decomposition or through feature extraction.
In the field of traditional ML, various signal and data processing techniques have been integrated with lightweight classifiers. The binary SVM classifier is particularly favored for its creation of models where the decision function is a simple linear function with coefficients equal to the number of features being classified. Linear and nonlinear kernels such as RBF and Polynomial have been commonly employed in classifications, with the latter two incurring higher computational costs compared to the linear kernel.
Studies utilizing SVM-based feature classification have employed signal decomposition methods in time-domain subbands, including the DWT [22], the Hilbert–Huang transform (HHT) with the empirical wavelet transform (EWT) [23], the discrete cosine transform (DCT) [24], and conditional mutual information maximization (CMIM) feature extraction [25], achieving high accuracy levels of 95.6%, 100%, 97%, and 99.83%, respectively. In [26], the SVM approach was compared with algorithms such as K-nearest neighbors (KNN) and linear discriminant analysis (LDA) in the classification of third-order tensors obtained from the HHT and WT, represented by canonical polyadic decomposition (CPD) and block term decomposition (BTD), achieving performance exceeding 98%.
In [27], a KNN-based approach classified histograms of multilevel local patterns (MLP) obtained through empirical mode decomposition (EMD) and intrinsic mode functions (IMFs), yielding an accuracy of 98.67%. Finally, [28] applied a five-level EWT to decompose EEG signals, utilizing time-frequency features in real time and further processing the classification outcomes, achieving a mean accuracy of 99.88%.
ML algorithms, unlike DL models, necessitate manual feature extraction. Nonetheless, results indicate that these procedures do not significantly impair model performance. Furthermore, controlling the number of input features allows for the creation of simpler models with a small number of representative examples per class. Overall, these models, due to their low data volume requirements for training, better fulfill the computational cost reduction needs.
4. Results and Discussion
4.1. Analysis of EEG Signals
The EEG signal analysis experiments in this study were executed using MATLAB, version R2023b, on a Lenovo ThinkPad T430 laptop (Lenovo, Beijing, China) equipped with an Intel Core i5 processor and 16 GB of RAM, running 64-bit Windows 10. Twenty EEG files were selected from each of the Helsinki Hospital and Boston Hospital databases. The signals contained periods labeled with and without epileptic seizures. Because the EEG montage configurations used in recording each dataset differed, a custom SVM feature classification model was created for each database. Due to the spatial nature of EEG signals, certain channels demonstrated superior performance relative to others in the experiments. Preprocessing, signal decomposition, feature extraction, and feature classification were performed on EEG signal intervals demarcated as seizure or non-seizure.
4.2. Datasets
The EEG seizure record databases, which are considered in this study, are publicly available and were recorded at the Helsinki University Hospital and Boston Children’s Hospital. EEG signals were labeled by neurophysiologists as either indicative of seizure or non-seizure activity, marking the commencement and conclusion of epileptic episodes. Both databases were sampled at a frequency of 256 Hz, adhering to the electrode placement guidelines outlined by the international 10–20 system.
4.2.1. A Dataset of Neonatal EEG Recordings with Seizure Annotations
The EEG recordings in this database were acquired between 2010 and 2014 at the neonatal intensive care unit (NICU) of Helsinki University Hospital [37]. The recordings were obtained from 79 newborn patients aged 32 to 45 weeks, have durations ranging from 64 to 96 min, and utilize 19 channels.
4.2.2. CHB-MIT Scalp EEG Database
The database was recorded in 2010 at Children's Hospital Boston (CHB-MIT) [38]. EEG signals were acquired from 22 patients with intractable seizures, aged between 1.5 and 22 years. The duration of the EEG signals was standardized to 60 min across the entire dataset, using 23 channels. A European Data Format (EDF) file from the CHB-MIT dataset, along with its signals, is depicted in Figure 4.
4.3. EEG Signal Filtering with SGF
The parameters of the Savitzky–Golay filter were chosen experimentally through the following steps. First, a test signal was deliberately contaminated with white Gaussian noise at levels of −15, −10, −5, 0, 5, 10, and 15 dB. The signals were then filtered with all possible combinations of filter parameters, with order and frame length each ranging from 3 to 41. Parameters were selected based on their capacity to raise the signal-to-noise ratio (SNR) of a signal contaminated with −15 dB white Gaussian noise (WGN) to −3 dB or better. Moreover, the parameters had to ensure a cutoff frequency of at least 60 Hz, which is at least twice the highest fundamental frequency measured in the beta band (30 Hz), in accordance with the Nyquist sampling criterion. Finally, the correlation between the test signal before filtering and the filtered output was computed for each parameter configuration identified in the previous step. These tests revealed that the filter with order 22 and frame length 35 achieved the best performance in terms of SNR improvement. The performance outcomes are shown in Table 1. The cutoff frequency of the filter was estimated by constructing spectrograms of the signal contaminated with WGN at an SNR of −15 dB and of the signal at the filter's output. The spectrograms in Figure 5 illustrate how the filter's cutoff frequency fluctuates between 60 and 70 Hz.
Finally, applying the filter to the original signal without added noise demonstrated its efficacy: the SGF with order 22 and frame length 35 achieved the highest correlation coefficient, 0.98181, indicating substantial similarity between the original and filtered signals.
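The parameter search described above can be reconstructed as a short sketch; this is an illustrative reconstruction, not the study's MATLAB code, and the synthetic single-tone test signal is an assumption made for brevity:

```python
import numpy as np
from scipy.signal import savgol_filter

def snr_db(clean, noisy):
    """SNR of `noisy` relative to `clean`, in dB."""
    noise = noisy - clean
    return 10 * np.log10(np.sum(clean**2) / np.sum(noise**2))

rng = np.random.default_rng(0)
fs = 256
t = np.arange(0, 4, 1 / fs)
clean = np.sin(2 * np.pi * 10 * t)  # synthetic test tone standing in for EEG

# Contaminate the test signal to exactly -15 dB SNR with white Gaussian noise
target_snr = -15
noise = rng.standard_normal(clean.size)
noise *= np.sqrt(np.sum(clean**2) / (np.sum(noise**2) * 10**(target_snr / 10)))
noisy = clean + noise

# Grid search over (order, frame length); the frame length must be odd
# and strictly greater than the polynomial order
best = None
for frame in range(5, 42, 2):
    for order in range(3, frame):
        filtered = savgol_filter(noisy, frame, order)
        s = snr_db(clean, filtered)
        if best is None or s > best[0]:
            best = (s, order, frame)

best_snr, best_order, best_frame = best  # best combination by output SNR
```

In the study, the winning combination under the additional cutoff-frequency and correlation constraints was order 22 with frame length 35.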
4.4. Wavelet Signal Decomposition
The filtered signals underwent wavelet transformation based on the Mallat algorithm [39]. Figure 6 illustrates the decomposition diagram of the employed filter bank. The signal was filtered using high-pass filters (HPF) to calculate detail coefficients (D) and low-pass filters (LPF) to compute approximation coefficients (A). Additionally, to reduce dimensionality, the output of each filter was downsampled by a factor of 2. The Daubechies 8 (db-8) wavelet function was chosen for its effective balance between time and frequency localization.
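The filter-bank step of the Mallat algorithm (filter, then downsample by 2) can be illustrated with a minimal sketch; it uses the two-tap Haar wavelet for brevity, whereas the study employs db-8 with 16 filter taps:

```python
import numpy as np

# Haar analysis filters (the paper uses db-8; Haar keeps the sketch short)
LPF = np.array([1, 1]) / np.sqrt(2)   # low-pass  -> approximation coefficients
HPF = np.array([1, -1]) / np.sqrt(2)  # high-pass -> detail coefficients

def dwt_level(x):
    """One Mallat decomposition level: convolve, then downsample by 2."""
    a = np.convolve(x, LPF)[1::2]  # approximation (A)
    d = np.convolve(x, HPF)[1::2]  # detail (D)
    return a, d

x = np.arange(8, dtype=float)   # toy signal
a1, d1 = dwt_level(x)           # level 1: 4 samples each, halved rate
a2, d2 = dwt_level(a1)          # level 2: 2 samples, covering 0..fs/4
```

Because the filters are orthonormal, the energy of each level's approximation and detail coefficients equals the energy of the input, which is one reason dyadic decomposition is a lossless way to cut the data rate in half per level.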
The signals passed through the wavelet filter bank, where wavelet coefficients were constructed in the alpha (8–13 Hz) band, associated with states of conscious relaxation [8], and the beta (13–32 Hz) band, attributed to motor functions [6]. In the alpha band, we utilized coefficients A7 and A11 at decomposition levels 4 and 5, while for the beta band we used coefficients D9, D11, A6, A8, and A10 at decomposition levels 3 and 4. The number of samples per 1 s epoch was thus reduced from 256 to 32 for the beta band and to 16 for the alpha band.
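The quoted sample counts follow from the dyadic downsampling: at level k the coefficient rate is fs/2^k, and the detail coefficients cover roughly fs/2^(k+1) to fs/2^k. A small sketch with idealized band edges (real wavelet filters have gradual transitions):

```python
FS = 256  # Hz, sampling rate of both datasets

def level_info(level, fs=FS):
    """Coefficient rate and approximate detail band for a dyadic DWT level."""
    rate = fs / 2**level
    band = (fs / 2**(level + 1), fs / 2**level)
    return rate, band

# Level 3 details: 32 samples/s, ~16-32 Hz (covers most of the beta band)
# Level 4 details: 16 samples/s, ~8-16 Hz (covers the alpha band)
```

This is why one second of signal shrinks from 256 samples to 32 for the beta band and 16 for the alpha band.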
The delta and theta bands were not employed because they typically appear in infants, while in adults they appear only during sleep or, when present during wakefulness, indicate other disorders [7]. The gamma band, related to cognitive brain functions, was also excluded because it requires combining wavelet coefficients from levels 2, 3, and 4, with level 2 operating at 64 samples per second. Our focus is on reducing computational complexity, a requirement better met by the alpha band, at 16 samples per second with two wavelet coefficients, and the beta band, at 32 samples per second with five wavelet coefficients.
4.5. Feature Extraction
From the alpha and beta wavelet coefficients, intervals labeled as seizure and non-seizure were selected. The mean, variance, skewness, kurtosis, entropy, and energy were computed for the sample sets within one-second epochs. Additionally, intermediate features were extracted between consecutive epochs by overlapping half-epochs, so that interactions between adjacent epochs were also considered.
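The epoch-wise feature extraction with half-epoch overlap can be sketched as follows; this is an illustration rather than the study's MATLAB code, and the Shannon-entropy formulation on normalized coefficient power is an assumption, one common choice among several:

```python
import numpy as np
from scipy.stats import skew, kurtosis

def epoch_features(coeffs, samples_per_epoch=16, overlap=True):
    """Six statistical features per epoch of wavelet coefficients.

    With overlap=True the window advances by half an epoch, so intermediate
    windows spanning two adjacent epochs are also represented.
    """
    step = samples_per_epoch // 2 if overlap else samples_per_epoch
    feats = []
    for start in range(0, len(coeffs) - samples_per_epoch + 1, step):
        w = coeffs[start:start + samples_per_epoch]
        p = w**2 / np.sum(w**2)              # normalized power for entropy
        feats.append([
            np.mean(w),
            np.var(w),
            skew(w),
            kurtosis(w),
            -np.sum(p * np.log2(p + 1e-12)), # Shannon entropy (assumed form)
            np.sum(w**2),                    # energy
        ])
    return np.asarray(feats)

rng = np.random.default_rng(1)
alpha_coeffs = rng.standard_normal(160)  # e.g. 10 s of alpha coefficients (16/s)
F = epoch_features(alpha_coeffs)         # one 6-feature row per half-epoch step
```

Each row of `F` becomes one feature vector for the SVM classifier described in the next subsection.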
4.6. SVM-Based Feature Classification
Feature sets were created for the seizure and non-seizure classes, ensuring balanced representation to mitigate data imbalance. These sets were used to train a binary SVM classifier with a linear kernel and penalty parameter C. The linear kernel was chosen for its computational simplicity, as it requires only a dot product, unlike more complex kernels such as the RBF or polynomial kernels. Additionally, the dataset exhibited a quasi-separable distribution between the two classes, with the sets reasonably well defined on either side of the hyperplane despite some overlap in the middle. The value of C was determined through cross-validation training on one-fifth of the total dataset, considering a loss threshold between 1% and 5% of the total classified features during validation. Higher C values reduced error rates but also produced SVM models prone to overfitting, resulting in diminished generalization capability.
Models were trained separately for the alpha and beta bands, significantly reducing the amount of computation and the processing time needed to assess the performance of each band. Specifically, the processing time for the alpha band was shorter than for the beta band because the alpha band was constructed with 16 samples per epoch, whereas the beta band was constructed with 32.
Model performance was evaluated on a separate test dataset. The class of each test feature vector was determined by applying the SVM's decision function, with a negative output classified as non-seizure and a positive output as seizure. Three SVM models were generated: the first for the CHB-MIT dataset, the second for the Helsinki hospital dataset without pre-filtering, and the third for the pre-filtered Helsinki hospital dataset. Models 2 and 3 were intended to compare performance on unfiltered EEG signals versus signals filtered with the Savitzky–Golay filter (frame length 35, order 22), thereby evaluating how effectively this setup enhances model performance. The filter parameters were chosen through extensive trial and error because improper Savitzky–Golay settings risk information loss, potentially resulting in poorer performance than models trained on unfiltered signals.
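A hedged sketch of this training setup in scikit-learn, using synthetic quasi-separable features in place of the real wavelet statistics; the C grid, the standardization step, and the synthetic class means are assumptions, not the study's exact configuration:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(2)

# Synthetic quasi-separable feature sets (6 features per epoch, balanced classes)
n = 200
X_nonseiz = rng.normal(0.0, 1.0, size=(n, 6))
X_seiz = rng.normal(2.0, 1.0, size=(n, 6))   # shifted cluster, some overlap
X = np.vstack([X_nonseiz, X_seiz])
y = np.array([0] * n + [1] * n)              # 0: non-seizure, 1: seizure

# Linear kernel; penalty parameter C tuned by 5-fold cross-validation
model = GridSearchCV(
    make_pipeline(StandardScaler(), SVC(kernel="linear")),
    param_grid={"svc__C": [0.01, 0.1, 1, 10, 100]},
    cv=5,
)
model.fit(X, y)
```

As in the paper, the sign of the fitted decision function gives the class: negative for non-seizure, positive for seizure; larger C values fit the training folds more tightly at the cost of generalization.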
4.7. Model Classification Performance
The evaluation of the generated models involved analyzing sensitivity, specificity, and accuracy metrics. Sensitivity (recall) quantifies the model's capacity to accurately detect true positives (TP) among all actual positives, where FN denotes the false negatives; it evaluates the model's proficiency in detecting positive cases accurately:

Sen = TP / (TP + FN)

Specificity quantifies the model's ability to accurately identify true negatives (TN) among all actual negatives, where FP represents the false positives, thereby assessing its capability to reduce false alarms in negative case identification:

Spec = TN / (TN + FP)

Accuracy denotes the proportion of correct predictions (both positive and negative) made by the model relative to all predictions made (TP + TN + FP + FN), and represents an overall measure of the model's performance:

Acc = (TP + TN) / (TP + TN + FP + FN)
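These three metrics follow directly from the confusion-matrix counts, as a minimal sketch:

```python
def sensitivity(tp, fn):
    """Recall: true positives over all actual positives."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negatives over all actual negatives."""
    return tn / (tn + fp)

def accuracy(tp, tn, fp, fn):
    """Correct predictions over all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

# Example with hypothetical counts: 90 TP, 10 FN, 85 TN, 15 FP
# sensitivity -> 0.90, specificity -> 0.85, accuracy -> 0.875
```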
Table 2 presents the recall (Sen), specificity (Spec), and accuracy (Acc) obtained by the three classifier models across the alpha and beta frequency bands. For each EDF file, the table indicates the signal (channel) that exhibited the best classification performance; this stems primarily from the fact that certain seizures occur in specific channels and may not be detected throughout the entire cerebral cortex. The number of seizures annotated in each EDF file by neurophysiologists is also reported. The model trained on data from the CHB-MIT dataset achieved an average accuracy of 90.3% for the alpha band and 89.7% for the beta band; the classification results are detailed in Table 2. A graphical comparison of the accuracy achieved by the CHB-MIT model for the alpha and beta bands is depicted in Figure 7.
The model achieved high accuracy rates for each frequency band separately, demonstrating the algorithm's capability to detect seizures in EEG signal representations with significantly reduced dimensionality and data volume, even from a single channel.
For the classification models developed for the Helsinki Hospital database, the model trained on unprocessed data achieved an accuracy of 73.35% in the alpha band and 70.13% in the beta band. Conversely, the model trained on preprocessed data achieved an average accuracy of 92.82% in the alpha band and 90.55% in the beta band. These results underscore the importance of preprocessing EEG signals.
Table 3 and Table 4 show the performance obtained by the models constructed with the unfiltered and prefiltered databases, respectively. A visual comparison of the performance of the Helsinki hospital models for the alpha and beta bands is presented in Figure 8. Models constructed with prefiltered data exhibit significantly superior performance compared to those built with unfiltered data; thus, it can be concluded that prefiltering notably improves the reliability of EEG signal classification.
Regarding processing speed, analyzing a 3600-s signal in the alpha band took an average of 70 s, a processing time of 0.0194 s per second of signal. For the beta band, analyzing a signal of the same length took an average of 250 s, a processing time of 0.0694 s per second of signal. This difference is expected, given that the alpha band uses 16 samples per second and two wavelet coefficients while the beta band uses 32 samples per second and five wavelet coefficients, and it shows that, in this context, alpha-band analysis is significantly more computationally efficient than beta-band analysis. Furthermore, given the algorithm's high processing speed in the alpha band and the effort to minimize computations, there is clear potential for deploying this model on low-cost devices or in simple applications.
The obtained results represent the culmination of multiple experiments; the proper selection of training data, together with their quality and quantity, leads to superior models. The sensitivity, specificity, and accuracy metrics demonstrate the model's ability to detect and categorize seizures accurately. Models constructed with prefiltered data consistently demonstrate superior performance compared to those built with unfiltered data, as evidenced by higher average accuracy in both the alpha and beta bands. This highlights the importance of prefiltering signals to enhance the quality of the analyses.
4.8. Comparison with Other Works
To evaluate the proposed approach, its performance is compared with other models in the current state of the art. The comparative results are shown in Table 5. These results demonstrate that state-of-the-art models generally exhibit excellent detection accuracy. However, it is clear that as researchers strive to improve accuracy, the complexity of the models also increases. Many studies do not consider this factor, fail to mention it, or address it only superficially. Since the primary focus of these works is achieving the highest accuracy regardless of computational cost, discussions about implementing these algorithms on low-cost devices or in simple applications are often overlooked.
To highlight our model's ability to reduce computational cost, Table 5 includes the operational time per second of signal where such data were provided. For studies that did not supply this information, the entry is marked "not available" (N/A); most of these involve models of very high computational complexity whose authors focused solely on accuracy. The studies [14,24,26] examined their algorithms' processing time, achieving excellent speed results. In this regard, our approach ranked second among the fastest, surpassed only by [14].
These findings encourage further exploration of the model’s efficiency in real-time scenarios. The remaining studies either do not provide processing speed data for their models or only briefly mention computational cost without further elaboration.
Drawing from these findings, it becomes evident that the proposed model demonstrates competitive performance compared to other methods. Further enhancements could be achieved by improving dataset quality, for example by employing appropriate feature selection methods for training or considering new feature functions. Additionally, exploring better SVM models by varying the amount of training data, adjusting the slack parameter C, or utilizing kernel functions could lead to improvements. Despite exhibiting slightly lower performance than some of the reviewed models, the proposed model showcases its effectiveness with a satisfactory accuracy of 92.82% for the alpha band and 90.55% for the beta band. These findings suggest that the model could be used in epileptic seizure detection tasks, potentially extending its operation to real-time measurement devices, given its low computational complexity.