Article

Cognitive State Classification Using Convolutional Neural Networks on Gamma-Band EEG Signals

by Nuphar Avital 1,2, Elad Nahum 3, Gal Carmel Levi 3 and Dror Malka 3,*
1 Faculty of Education, Bar Ilan University, Ramat-Gan 5290002, Israel
2 Early Childhood Education, Talpiot College of Education, Holon 5810201, Israel
3 Faculty of Engineering, Holon Institute of Technology (HIT), Holon 5810201, Israel
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(18), 8380; https://doi.org/10.3390/app14188380
Submission received: 8 August 2024 / Revised: 13 September 2024 / Accepted: 14 September 2024 / Published: 18 September 2024

Abstract:
This study introduces a novel methodology for classifying cognitive states using convolutional neural networks (CNNs) on electroencephalography (EEG) data of 41 students, aimed at streamlining the traditionally labor-intensive analysis procedures utilized in EEGLAB. Concentrating on the 30–40 Hz frequency range within the gamma band, we developed a CNN model to analyze EEG signals recorded from the inferior parietal lobule during various cognitive tasks. The model demonstrated substantial efficacy, achieving an accuracy of 91.42%, precision of 71.41%, and recall of 72.51%, effectively distinguishing between high and low gamma activity states. This performance surpasses traditional machine learning methods for EEG analysis, such as support vector machines and random forests, which typically achieve accuracies of 70–85% for similar tasks. Our approach offers significant time savings over manual EEGLAB methods. The integration of event-related spectral perturbation (ERSP) analysis with a novel CNN architecture enables capture of both fine-grained and broad spectral EEG features, advancing the field of computational neuroscience. This research has implications for brain-computer interfaces, clinical diagnostics, and cognitive monitoring, offering a more efficient and accurate alternative to current EEG analysis methods.

1. Introduction

Electroencephalography (EEG) has emerged as a cornerstone technology in neuroscience, offering real-time insights into brain activity through non-invasive recording of neuronal electrical signals [1]. This technique has revolutionized our understanding of brain function and has found applications in various fields, from clinical diagnostics to brain-computer interfaces [2]. The present study aims to harness EEG data to classify brain states during cognitive tasks, focusing on the frequency range of 30–40 Hz, which includes part of the gamma band associated with high-level cognitive functions [3,4]. Despite these advancements, current EEG analysis procedures, particularly when using popular software such as EEGLAB (version 2024.0) [5], remain largely manual and time-consuming. Researchers often spend significant time on preprocessing, feature identification, and interpretation of complex spectral patterns, which can slow down research progress and limit the volume of data analyzed. Our study is motivated by the need to accelerate and automate these analytical procedures, allowing researchers to focus more on interpreting results and generating insights. By developing an automated approach for cognitive state classification, we aim to provide a tool that can significantly reduce the time and effort required for EEG analysis, potentially accelerating discoveries in cognitive neuroscience and improving clinical applications of EEG technology. Building on this foundation, recent advancements in EEG technology have enabled more precise and reliable data collection, allowing researchers to explore subtle changes in brain activity across different cognitive states [6]. The gamma band, typically defined as oscillations between 30 and 100 Hz, has been of particular interest due to its association with various cognitive processes, including attention, memory, and perceptual binding [7,8]. 
Our focus on the 30–40 Hz range allows us to capture a significant portion of this band while minimizing potential muscle artifacts that can contaminate higher frequencies [9]. To further refine our approach, four specific EEG channels corresponding to electrode positions over the inferior parietal lobule were chosen for analysis. This region is critically involved in higher-order cognitive processes such as working memory, attention, and executive function [10,11]. By focusing on these channels, the study aimed to capture the most relevant neural activity associated with cognitive tasks. The inferior parietal lobule has been implicated in numerous studies as a key area for cognitive control and decision-making [12,13], making it an ideal target for our investigation. To analyze this complex data, our approach leverages CNNs, which have shown remarkable success in various domains, including image classification and natural language processing [14,15]. In the field of EEG analysis, CNNs have demonstrated superior performance compared to traditional machine learning methods in tasks such as motor imagery classification, seizure detection, and mental workload estimation [16]. The choice of a CNN model for our study was also driven by its compatibility with emerging optical computing architectures that integrate CNN networks into optical chips. This alignment offers advantages for EEG analysis, including improved processing speed and energy efficiency. The optical CNN architecture allows for parallel processing of multiple inputs, ideal for EEG data’s multiple channels and time-frequency representations. The spatial nature of EEG data aligns well with CNN’s 2D convolution operations, enabling effective capture of local patterns. The scalability and noise-resilience of optical CNN implementations are beneficial for EEG analysis, allowing for more complex models and robust performance. 
This hardware-based approach facilitates real-time processing of massive datasets, offering low-volume 3D connectivity with large bandwidth and minimal heat production compared to electronic implementations [17]. A CNN model capable of classifying EEG data based on cognitive states was developed, potentially pushing the boundaries of brain-computer interfaces and contributing to practical applications in cognitive monitoring. The effectiveness of CNNs in EEG analysis has been further validated by recent studies. For instance, researchers have successfully applied CNNs to classify EEG signals for emotion recognition and developed CNN-based approaches for automated sleep stage classification [16]. These advancements underscore the potential of deep learning techniques in decoding complex brain signals and extracting meaningful patterns from EEG data [18,19]. The implications of this research extend far beyond the laboratory, spanning multiple domains and offering significant impacts on human activity and well-being. In education, it could enable real-time adaptation of teaching strategies [20,21], while in workplaces, it could optimize environments for productivity [22]. The technology shows promise in mental health for monitoring attention-related conditions [23], in neuroergonomics for enhancing human-computer interaction [24], and in sports for optimizing training and performance [25]. Additionally, it could provide objective measures for meditation practices [26] and assist in cognitive rehabilitation for individuals with brain injuries or neurodegenerative diseases [27]. These diverse applications highlight the broad potential impact of this research on various aspects of human life and society. In essence, this study combines rigorous EEG analysis with advanced machine learning techniques to develop a tool for analyzing cognitive states during various tasks.
By bridging the gap between neuroscience and artificial intelligence, our aim is to contribute to the growing field of computational neuroscience and pave the way for more sophisticated brain-computer interfaces. The following sections detail our methodology, including data collection, preprocessing, model development, and evaluation, followed by a presentation of our results and a discussion of their implications for future research in cognitive neuroscience. Our work builds upon the foundation laid by pioneers in the field of EEG analysis and machine learning, while introducing novel approaches to address the challenges of decoding complex brain signals. This research aims to advance the understanding of cognitive processes and contribute to the development of practical applications that can enhance human cognition and well-being. In the field of education, this research presents substantial advantages by facilitating personalized learning through real-time feedback mechanisms, allowing for the adjustment of instructional methods based on student attentiveness, and providing objective data to enhance traditional assessment practices. Additionally, it contributes to special education by customizing learning plans and enabling early intervention strategies. The research also supports the monitoring of stress and mental health conditions, thereby promoting timely intervention. Overall, the integration of electroencephalography (EEG) with machine learning technologies enhances our comprehension of cognitive processes and advances educational practices.

2. Methodology

2.1. Participants

The study involved 41 students from the Holon Institute of Technology in Israel, who were assessed for cognitive workload and neural engagement through their interaction with an academic article by E. Cohen et al., titled “Neural Networks within Multi-Core Optic Fibers” [17].

2.2. Experimental Apparatus and Procedures

Each participant initially read the article individually while being monitored using an EEG device (eego™mylab [28]) with a sampling frequency of 500 Hz to record neural activity in real-time. Following the reading phase, participants engaged in individual discussions with a researcher to elucidate their understanding and reflections on the article. This dual-phase approach enabled a nuanced analysis of both cognitive processing during the reading and the depth of interactive engagement, thereby providing a comprehensive evaluation of cognitive workload and neural responses as captured by the EEG. The study implements a comprehensive, multi-stage approach to analyze EEG data and develop a CNN model for detecting specific patterns of brain activity associated with cognitive states. This methodology integrates advanced signal processing techniques with machine learning algorithms to address the complexities of EEG data analysis and cognitive state classification. By focusing on the 30–40 Hz frequency range within the gamma band, known for its association with high-level cognitive functions, we aim to capture and classify subtle neural signatures of different cognitive processes.

2.3. Data Pre-Processing

The raw EEG data underwent a series of preprocessing steps to enhance signal quality and isolate the relevant neural activity. These steps were implemented using MATLAB and the EEGLAB toolbox (version 2024.0) [5], a widely used software package for EEG data analysis. The preprocessing pipeline began with data import and initial processing, which included loading .cnt files using EEGLAB’s import function for ANT EEProbe .cnt files. Channel locations were then imported using a standardized electrode location file, which provides precise spatial information for each electrode. Following this, the data was re-referenced to the average of all scalp electrodes. This re-referencing step, while maintaining the original channel count, effectively reduced the data rank by one. Independent component analysis (ICA) decomposition was then performed using the extended infomax algorithm with the ‘extended’ parameter set to 1. The ICA decomposition was designed to automatically detect and adjust to the reduced rank of the data resulting from the average referencing step, ensuring an appropriate analysis of the re-referenced data [29,30]. Artifact rejection and cleaning were carried out using the clean_rawdata plugin, which incorporates artifact subspace reconstruction (ASR). This process applied specific criteria for flatline detection (10 s), channel correlation (0.7), line noise (4 standard deviations), burst detection (20 standard deviations), and window rejection (0.25 standard deviations). Bad data periods were also removed during this stage. The final step involved frequency-specific filtering using an FIR bandpass filter. This filter was applied with a low edge frequency of 26.5 Hz and a high edge frequency of 45 Hz, using a filter order of 200. The DC offset was removed as part of this filtering process.
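As a compact reference, the cleaning and filtering parameters above can be collected into a single configuration. The following Python sketch is illustrative only; the key names are ours for documentation purposes and are not EEGLAB API arguments:

```python
# Illustrative summary of the preprocessing parameters described above.
# Key names are ours, not EEGLAB's actual function arguments.
PREPROCESSING = {
    "sampling_rate_hz": 500,
    "reference": "average",              # re-reference to scalp-electrode average
    "ica_algorithm": "extended infomax", # 'extended' parameter set to 1
    "asr_cleaning": {
        "flatline_criterion_s": 10,      # flatline detection window
        "channel_correlation": 0.7,      # minimum channel correlation
        "line_noise_sd": 4,              # line-noise threshold (std devs)
        "burst_criterion_sd": 20,        # ASR burst detection (std devs)
        "window_criterion": 0.25,        # window rejection threshold
    },
    "bandpass": {
        "low_edge_hz": 26.5,
        "high_edge_hz": 45.0,
        "filter_order": 200,
        "remove_dc": True,
    },
}

def passband_width_hz(cfg):
    """Width of the retained frequency band in Hz."""
    bp = cfg["bandpass"]
    return bp["high_edge_hz"] - bp["low_edge_hz"]

print(passband_width_hz(PREPROCESSING))  # 18.5
```

The retained 26.5–45 Hz passband deliberately brackets the 30–40 Hz band of interest, leaving room for filter roll-off at the edges.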
This comprehensive preprocessing pipeline was designed to isolate the frequency range of 26.5–45 Hz, which encompasses the gamma band of interest in this study, while effectively removing artifacts and improving signal quality. This ensured the data were optimally prepared for subsequent analysis of gamma band activity (30–40 Hz) in the inferior parietal lobule region. Figure 1 illustrates the topographical distribution of EEG channels, with particular emphasis on specific channels highlighted with white backgrounds (3LD, 4LD, 4LC, 5LC). These channels, located in the inferior parietal lobule region, were of specific interest in our study due to their association with cognitive processes relevant to our research objectives. The figure employs a continuous color spectrum ranging from red through orange, yellow, green, to light blue, typically corresponding to varying levels of neural activity or power in EEG representations. Areas in red and orange generally indicate higher activity, blue regions suggest lower activity, and green and yellow represent intermediate levels. The central and parietal regions, including the highlighted channels, display cooler colors (light blue to green), potentially indicating relatively lower activity in these areas of interest, while the periphery shows warmer colors (yellow to red), suggesting higher activity in the frontal and occipital regions.
To fully leverage the complex spatial and temporal information captured by the EEG electrodes, event-related spectral perturbation (ERSP) analysis was conducted on epochs ranging from −100 ms to 900 ms relative to the event, focusing on the 30–40 Hz gamma frequency band. Wavelet transformation was applied with cycles set to [3 0.5] at the lowest and highest frequencies, yielding 191 output times. Baseline correction used the −200 ms to 0 ms period, with absolute power values for scaling. The analysis, performed on four channels in the inferior parietal lobule region, employed a binary classification scheme based on the 85th percentile of mean power across the 30–40 Hz range. Statistical significance was assessed at an alpha level of 0.05 using false discovery rate correction. Time points exceeding the power threshold were classified as high gamma activity. ERSP matrices were normalized to ensure consistency across datasets.
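The 85th-percentile labeling rule described above can be sketched in pure Python. This is a minimal illustration of the thresholding logic, assuming an ERSP matrix laid out as frequency rows by time-point columns; the nearest-rank percentile stands in for EEGLAB's internal computation:

```python
# Sketch of the binary labeling step: average ERSP power across the
# 30-40 Hz rows at each time point, then threshold at the 85th percentile.
# The matrix layout (frequencies x time points) is an assumption.

def percentile(values, pct):
    """Nearest-rank percentile of a list of numbers."""
    ordered = sorted(values)
    k = max(0, min(len(ordered) - 1, round(pct / 100 * len(ordered)) - 1))
    return ordered[k]

def label_high_gamma(ersp, pct=85.0):
    """ersp: list of rows (one per frequency), each a list of power values.
    Returns one 0/1 label per time point (column)."""
    n_times = len(ersp[0])
    mean_power = [sum(row[t] for row in ersp) / len(ersp) for t in range(n_times)]
    threshold = percentile(mean_power, pct)
    return [1 if p > threshold else 0 for p in mean_power]

# Toy example: 3 frequency rows x 5 time points.
ersp = [
    [0.1, 0.2, 0.9, 0.3, 0.2],
    [0.2, 0.1, 1.1, 0.2, 0.3],
    [0.1, 0.3, 1.0, 0.4, 0.1],
]
print(label_high_gamma(ersp))  # [0, 0, 1, 0, 0]
```

Only the middle time point, whose frequency-averaged power clearly exceeds the percentile threshold, is labeled as high gamma activity.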

2.4. CNN Model

These ERSP representations, serving as image inputs, form the basis for our CNN approach. The CNN processes these time-frequency images, learning to recognize and classify distinctive patterns associated with different cognitive states. This transformation of raw EEG signals into ERSP images, sized 3 × 191, for deep learning analysis aims to reveal insights that might be missed by conventional analytical methods. The architecture of the network, as illustrated in Figure 2, is as follows:
The model begins with an image input layer designed for EEG data, configured to accept time-frequency representations with dimensions of 3 × 191 × 1. This layer incorporates a normalization step, standardizing the input data by scaling it to a common range, between 0 and 1. This normalization process is crucial for several reasons: it ensures equal contribution of all input features to the learning process, helps stabilize the training of deep neural networks, and mitigates the impact of varying scales across different frequency bands and time points. The core of our CNN model consists of eight intricately designed convolutional layers, each playing a crucial role in extracting hierarchical features from the EEG data. These layers utilize 3 × 3 kernels, likely with ‘same’ padding to preserve spatial dimensions throughout the network. The number of filters in our CNN follows an expanding-contracting pattern, starting with 32 filters, expanding to 64, 128, and 256 in the middle layers, and then contracting back to 128, 64, 32, and finally 1, creating a “bottleneck” architecture. This design allows the network to first capture a wide range of features at different levels of abstraction, from low-level signal characteristics to high-level cognitive state indicators, and then distill this information to focus on the most discriminative features for our EEG classification task. Following each convolutional layer (except the final one), a batch normalization layer is implemented, playing a critical role in enhancing the stability and efficiency of the learning process. This technique normalizes the inputs to each layer across a mini-batch, effectively reducing internal covariate shift—a phenomenon where the distribution of each layer’s inputs changes during training, potentially slowing down the learning process. Rectified Linear Unit (ReLU) activation functions are strategically applied after each batch normalization layer, introducing non-linearity into the model. 
The ReLU function, defined as [14]:
f(x)_{\mathrm{ReLU}} = \max(0, x)
offers several advantages in deep learning architectures. Firstly, it helps mitigate the vanishing gradient problem that can occur with other activation functions, allowing for more effective training of deep networks. Secondly, ReLU activations promote sparsity in the learned representations, as they output zero for all negative inputs. This sparsity can lead to more efficient and interpretable feature representations. In our EEG classification task, ReLU activations enable the model to capture complex, non-linear relationships in the time-frequency domain of the EEG signals. This non-linear capability is crucial for distinguishing between different cognitive states that may not be linearly separable in the input space, allowing the network to learn increasingly abstract and task-relevant features as data flows through the layers. Dropout layers are incorporated throughout the network as a key regularization technique to prevent overfitting and enhance the model’s generalization capabilities. The dropout rates vary across different layers (20%, 30%, 40%, 50%, 40%, 30%, and 20%), with higher rates in the middle of the network where the number of features is largest. This variable dropout scheme is designed to balance the trade-off between model capacity and generalization. During training, dropout randomly sets a fraction of input units to 0 at each update, which helps prevent complex co-adaptations on the training data. It helps the model to learn more generalizable features from the EEG signals rather than memorizing specific patterns that might be unique to the training set or irrelevant to the classification task. The network concludes with an output structure designed to transform the learned features into meaningful predictions. A 1 × 1 convolutional layer with a single filter is applied, acting as a learned linear combination of the features from the previous layer. 
This compact representation effectively distills the complex, multi-dimensional feature space into a single channel of information most relevant to our classification task. Following this, a sigmoid activation function is applied, which squashes the output to a range between 0 and 1. The sigmoid function, defined as [14]:
f(x)_{\mathrm{sigm}} = \frac{1}{1 + e^{-x}}
produces a probability-like score that is directly interpretable in our binary cognitive state classification task. This configuration allows the network to output continuous values that can be interpreted as the likelihood or degree of a particular cognitive state, enabling nuanced predictions beyond simple binary classification. In our EEG analysis, this output structure provides a flexible framework for quantifying the degree of cognitive engagement or attention, rather than just its presence or absence. This nuanced output can be particularly valuable in applications requiring fine-grained analysis of cognitive states or in scenarios where the intensity or certainty of a cognitive state is as important as its binary presence. This nuanced output capability, allowing for quantification of cognitive engagement intensity, enhances the model’s utility in applications requiring analysis of mental states. To address the class imbalance in the EEG dataset, a weighted binary cross-entropy loss function was implemented. This imbalance stems from our data labeling approach, resulting in a minority of samples being labeled as positive (high gamma activity) and a majority as negative (baseline activity). To account for this imbalance, we employed the following loss function [31]:
L_{wBCE} = -\frac{1}{M}\sum_{m=1}^{M}\left[ w \cdot y_{\mathrm{true}} \cdot \log(y_{\mathrm{pred}}) + (1 - y_{\mathrm{true}}) \cdot \log(1 - y_{\mathrm{pred}}) \right]
where M is the number of training examples, w is the weight (equal to 2.5 in our study), y_true is the target label for training example m, and y_pred is the prediction for training sample m. This 1:2.5 weighting scheme was determined empirically through iterative testing, aiming to strike an optimal balance between model sensitivity and overall performance. This approach helps mitigate the risk of the model overlooking important but infrequent neural events, which is crucial in EEG analysis where missing key cognitive state transitions could have significant implications for interpretation and subsequent applications.
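A minimal sketch of this weighted loss, assuming lists of binary targets and sigmoid outputs; the clamping epsilon is our addition for numerical safety, not part of the paper's formulation:

```python
import math

def weighted_bce(y_true, y_pred, w=2.5, eps=1e-12):
    """Weighted binary cross-entropy as described above: positive examples
    (the rarer high-gamma class) are up-weighted by w (2.5 in the paper)."""
    total = 0.0
    for yt, yp in zip(y_true, y_pred):
        yp = min(max(yp, eps), 1 - eps)  # clamp to avoid log(0)
        total += w * yt * math.log(yp) + (1 - yt) * math.log(1 - yp)
    return -total / len(y_true)

# Confident correct predictions cost little; confident misses cost a lot,
# and a missed positive is penalized w times more than a false alarm.
low = weighted_bce([1, 0], [0.9, 0.1])
high = weighted_bce([1, 0], [0.1, 0.9])
print(low < high)  # True
```

Because the positive term is scaled by w, gradient updates push the model harder to recover missed high-gamma events than to suppress false positives, which is the intended trade-off for the imbalanced labels.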

2.5. Evaluation

To assess the model’s proficiency in capturing these subtle cognitive variations and evaluate its overall performance, we employed three key metrics, each providing unique insight into a different aspect of classification performance. Accuracy serves as a fundamental measure of overall classification effectiveness: it measures the overall correct classification rate, indicating how often the model’s predictions align with the actual outcomes. Accuracy is calculated as the ratio of correct predictions (both true positives and true negatives) to the total number of cases evaluated. The formula for calculating accuracy is [32]:
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
where TP = true positives, TN = true negatives, FP = false positives, and FN = false negatives. While accuracy provides an overall measure of the model’s performance, it is important to consider more specific metrics that focus on the model’s ability to correctly identify positive instances. This leads us to consider precision and recall. Precision is a metric for assessing the reliability of positive predictions in classification models. It quantifies the proportion of true positive predictions among all instances predicted as positive by the model. Precision provides insight into the model’s ability to avoid falsely labeling negative instances as positive, which is particularly important in applications where false positives could have significant consequences. The formula for calculating precision is [32]:
\mathrm{Precision} = \frac{TP}{TP + FP}
Complementing precision, recall offers another crucial perspective on the model’s performance, particularly in situations where identifying all positive instances is critical. Recall is a metric for evaluating a classification model’s sensitivity to positive instances. It represents the proportion of actual positive cases correctly identified by the model, providing insight into the model’s effectiveness in detecting all relevant instances. In our study, recall becomes especially important for assessing the model’s performance in identifying the less frequent, but potentially more significant, positive instances. The formula for calculating recall is [32]:
\mathrm{Recall} = \frac{TP}{TP + FN}
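The three metrics can be computed directly from the confusion counts. The following sketch uses toy labels rather than the study's data:

```python
def confusion_counts(y_true, y_pred):
    """Tally TP, TN, FP, FN for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0

# Toy example: 6 ground-truth labels vs. 6 predictions.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, tn, fp, fn), precision(tp, fp), recall(tp, fn))
```

Note that with imbalanced labels, as in this study, accuracy alone can look strong while precision and recall on the minority class lag behind, which is why all three metrics are reported.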
Following the individual metric analyses, a comprehensive evaluation of the CNN model’s performance across different network architectures was conducted using the test set.

3. Decoding Results

The results of our study demonstrate the effectiveness of using EEG data in the 30–40 Hz gamma band range for classifying cognitive states associated with high-level cognitive functions. Figure 3 demonstrates topographical EEG activity patterns across three subjects, highlighting individual variations in brain activation during cognitive tasks. These visualizations emphasize the spatial distribution of neural activity, with warmer colors indicating higher activation levels. The observed inter-subject variability in activation patterns underscores the importance of considering individual differences in EEG analysis and supports the need for advanced classification techniques such as the CNN model developed in this study.
To further analyze the EEG data, ERSP analysis was conducted on the preprocessed data to examine spectral power changes over time and frequency. This analysis employs time-frequency decomposition methods to provide a high-resolution time-frequency representation of the EEG signal. The ERSP analysis examined spectral power changes in the 30–40 Hz range from −100 ms to 900 ms relative to the event, using a −200 ms to 0 ms baseline, with results expressed as percentage changes from baseline power to quantify event-related oscillatory activity in the gamma band. As illustrated in Figure 4, the ERSP plot reveals patterns of spectral power changes across time and frequency within the 30–40 Hz range. The ERSP plot reveals significant gamma band power increases around 500 ms and 800 ms post-stimulus (represented by warm colors), suggesting event-related synchronizations that may indicate heightened neural processing or cognitive engagement, with the later activity potentially reflecting sustained or secondary processing stages.
To fully leverage the insights provided by the ERSP analysis, it is crucial to correlate these time-frequency representations with the original EEG signal timestamps. This correlation is achieved by aligning each column in the ERSP plot with specific time points relative to experimental events marked in the original EEG data. By identifying when a feature occurs in the ERSP relative to an event and locating the corresponding event marker in the original EEG data, researchers can pinpoint where spectral changes occur in the continuous EEG recording. This alignment process creates a bridge between the time-frequency analysis and the raw EEG data, enabling an understanding of how spectral dynamics correspond to ongoing EEG activity throughout the experiment. Building upon this time-frequency representation, we developed a method to translate the continuous ERSP data into a binary classification of brain states. For each time point in the ERSP output, we calculated the mean power across the 30–40 Hz frequency range. This frequency-averaged power was compared to a threshold set at the 85th percentile of all power values in the dataset, a value chosen empirically after extensive testing to optimally balance sensitivity and specificity in detecting significant gamma activity. Time points where the mean power exceeded this threshold were classified as ‘1’, indicating high gamma activity, while those below were classified as ‘0’, representing baseline or lower gamma activity states. Figure 5 presents the result of this binary classification, where black regions represent time-frequency points exceeding the threshold, corresponding to periods of high gamma activity.
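Mapping above-threshold ERSP columns back to epoch latencies can be sketched as follows, assuming the 191 output times are uniformly spaced across the −100 ms to 900 ms epoch (the actual spacing produced by the wavelet decomposition may differ slightly):

```python
# Sketch: map each of the 191 ERSP output columns back to its latency in
# the -100 ms to 900 ms epoch, so above-threshold columns can be located
# in the continuous EEG recording. Uniform spacing is an assumption.

def ersp_column_latencies(t_start_ms=-100.0, t_end_ms=900.0, n_times=191):
    """Latency in ms for each ERSP output column."""
    step = (t_end_ms - t_start_ms) / (n_times - 1)
    return [t_start_ms + i * step for i in range(n_times)]

def high_gamma_latencies(labels, latencies):
    """Latencies (ms) of columns labeled 1 (high gamma activity)."""
    return [t for lab, t in zip(labels, latencies) if lab == 1]

lat = ersp_column_latencies()
print(len(lat))  # 191 latencies spanning the epoch
```

Adding each event marker's absolute timestamp to these relative latencies locates the corresponding spectral changes in the continuous recording.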
Our automated approach significantly accelerates the analysis process compared to traditional manual methods in EEGLAB. To demonstrate the advantage of our method over the traditional EEGLAB approach, we conducted a comparative analysis of processing times for various numbers of subjects (n). Both approaches follow similar analytical steps for preprocessing, but differ greatly in execution time, as illustrated in Table 1:
To summarize the table, there is a stark contrast in processing times between the two methods. For preprocessing, our Automatic-CNN approach demonstrates a significant reduction in time across all sample sizes, with savings of up to 68%. The disparity is even more pronounced in the classification phase, where our method completes the task in less than one second regardless of sample size, compared to the traditional EEGLAB method, which requires up to 200 min for larger datasets. These time savings can substantially expedite research processes, allowing investigators to analyze neural activity more efficiently and potentially explore larger datasets or conduct more comprehensive studies within the same timeframe. Moreover, our system incorporates advanced features such as automatic average power calculation and classification, extending beyond EEGLAB’s native capabilities to provide a more comprehensive and efficient analysis. With this preprocessing and binary classification completed, we proceeded to train and evaluate our CNN model on the prepared dataset. Figure 6 illustrates the evolution of the model’s accuracy over the course of training iterations for both the training and validation datasets. The graph demonstrates how the model’s predictive accuracy improves and stabilizes as training progresses, offering insights into the learning process and the model’s ability to generalize to unseen data in the validation set.
Figure 7 displays the progression of precision scores for both training and validation sets across training iterations. In the context of EEG analysis for cognitive state classification, high precision indicates that when the model predicts a specific cognitive state, it is likely to be correct. This graph helps visualize how the model’s precision evolves during training, providing insights into its reliability in identifying positive instances.
Figure 8 presents the recall scores for both training and validation sets over the course of training iterations. This metric is crucial for assessing the model’s effectiveness in detecting all relevant instances of the cognitive state of interest. The graph allows us to visualize how the model’s sensitivity to positive cases changes throughout the training process, providing insights into its ability to identify the less frequent, but potentially more significant, positive instances.
Figure 9 illustrates the relationship between network complexity and the three key performance metrics on the test data: accuracy, precision, and recall. The x-axis represents increasing network depth, shown as the ratio of convolutional layers to total layers. This visualization allows for the observation of how the model’s performance on unseen data evolves with increasing complexity, providing crucial insights into the optimal architecture for the EEG classification task.
As evident from the figure, accuracy shows a consistent upward trend as the network depth increases, stabilizing at around 90% for the more complex architectures. Precision and recall, however, exhibit more nuanced behaviors. Precision peaks with the 9/39 architecture before slightly declining, while recall shows the most variability, reaching its highest point with the 11/47 configuration. Interestingly, the 7/31 architecture emerges as a compelling balance between performance and complexity. While not achieving the highest individual scores, it demonstrates robust performance across all metrics without the computational overhead of deeper networks. The 7/31 model maintains high accuracy (>90%) while offering a favorable trade-off between precision and recall. Furthermore, the marginal gains in performance beyond the 7/31 architecture do not justify the significant increase in computational resources and training time required for deeper networks. This analysis not only provides valuable insights into the trade-offs between model complexity and performance but also guides our decision to select the 7/31 architecture as the optimal configuration for EEG signal classification in the 30–40 Hz range. Leveraging this optimized architecture, our CNN-based approach for detecting specific brain activity patterns in the 30–40 Hz range of EEG data demonstrated promising results across training, validation, and test sets. The training set performance showed high accuracy at 91.93%, with precision and recall closely balanced at 73.56% and 73.14%, respectively. This high accuracy indicates that the model learned to distinguish effectively between the two classes (high and low gamma activity) during the training phase. The balanced precision and recall suggest that the model performed consistently in identifying positive cases and avoiding false positives during training. 
This balance is crucial, as it indicates that the model is not biased towards over-predicting or under-predicting the presence of high gamma activity. The relatively high values across all three metrics suggest that the model has captured meaningful patterns in the EEG data related to cognitive states. Moving to the validation set, we observed a slight decrease in accuracy to 89.98%. While the recall remained similar to the training set at 73.66%, there was a more substantial drop in precision to 64.99%. This pattern suggests that the model may be overfitting slightly to the training data, leading to more false positives when presented with new data in the validation set. The maintenance of recall indicates that the model is still effectively identifying true positive cases, but the decrease in precision suggests it is also incorrectly classifying some negative cases as positive. This slight performance drop is not unusual when moving from training to validation data and highlights the importance of using techniques such as regularization and early stopping to mitigate overfitting. The test set results demonstrate the model’s ability to generalize to completely unseen data, which is crucial for real-world applications. The accuracy on the test set was 91.42%, comparable to that of the training set, indicating robust performance. Precision and recall values were well-balanced at 71.41% and 72.51%, respectively, showing improvement over the validation set. This improvement suggests that the model has good generalization capabilities and has not overfitted to the validation data. The high accuracy and balanced precision and recall on unseen data are particularly encouraging, as they indicate that the model can reliably classify EEG signals into periods of high and low gamma activity in novel situations. Overall, these results indicate that our CNN model can effectively classify EEG data into periods of high and low gamma activity with high accuracy. 
The consistent performance across all three datasets suggests that the model has learned meaningful features from the EEG signals and can apply this knowledge to new, unseen data. This consistency is a strong indicator of the model’s robustness and its potential for practical applications in EEG analysis. The slight variations in performance across the datasets also provide valuable insights into the model’s learning dynamics and areas for potential improvement in future iterations. The balance between precision and recall in the test set indicates that the model is equally capable of identifying true positives and avoiding false positives. This balance is particularly important in EEG analysis, where both the correct identification of cognitive states and the avoidance of false alarms are crucial. In practical terms, this means that when the model identifies a period of high gamma activity, it is likely to be correct, and it is also likely to catch most instances of high gamma activity. This balanced performance is essential for applications in both research and clinical settings, where missed detections and false alarms can both have significant consequences. These results provide a strong foundation for the use of deep learning techniques in EEG analysis, particularly in the context of cognitive state classification based on gamma band activity. The high accuracy and balanced precision and recall suggest that this approach could be valuable in both research and clinical applications where the detection of specific brain activity patterns is important. For instance, this model could potentially be used in brain-computer interfaces, cognitive workload monitoring, or in clinical settings for the early detection of certain neurological conditions associated with changes in gamma band activity. 
The robust performance across different datasets also suggests that the model could be adaptable to various EEG recording setups and subject populations, although further testing would be needed to confirm this.
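The interplay of accuracy, precision, and recall discussed above follows directly from the confusion-matrix definitions. As a minimal sketch, the counts below are hypothetical, chosen only so that the resulting metrics roughly reproduce the reported test-set values; they are not the study's actual confusion matrix.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Hypothetical counts: accuracy stays high while precision and recall sit
# lower because the negative (low-gamma) class dominates the data.
acc, prec, rec = classification_metrics(tp=145, fp=58, tn=1060, fn=55)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f}")
```

This also illustrates why accuracy alone can be misleading for imbalanced EEG data: with many true negatives, accuracy exceeds 91% even though roughly a quarter of high-gamma predictions and detections are missed or spurious.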

4. Discussion

Our study demonstrates the efficacy of using convolutional neural networks (CNNs) for classifying cognitive states based on EEG data in the 30–40 Hz gamma band range. The model’s high accuracy (91.42% on the test set) and balanced precision (71.41%) and recall (72.51%) indicate its potential as a reliable tool for EEG analysis in both research and clinical settings. These results not only align with recent advancements in deep learning applications for EEG data analysis but also surpass the performance of other machine learning methods such as support vector machines and random forests, which typically achieve accuracies between 70% and 85% for similar EEG classification tasks [21,22]. A key advantage of our CNN-based approach is the significant reduction in processing time compared to traditional manual methods, as shown in Table 1: a 68% time saving in preprocessing, and classification completed in under a second regardless of sample size. This efficiency gain enables increased research productivity, potential real-time cognitive state monitoring, and improved scalability across various study sizes. The model’s ability to detect specific brain activity patterns associated with high-level cognitive functions opens up new possibilities for understanding cognitive processes and their neural correlates. However, limitations such as the need for larger, more diverse datasets and potential overfitting should be addressed in future research. Additionally, expanding the analysis to other frequency bands and exploring multi-class classification could further enhance the model’s applicability. This study represents a significant step towards real-time cognitive state detection, with potential impacts spanning from basic neuroscience research to practical applications in education, brain-computer interfaces, and clinical diagnostics. Our approach has significantly enhanced and accelerated the exploration of brain activity and neural network behavior.
This advancement paves the way for more efficient replication of brain activity patterns in artificial systems, potentially revolutionizing the development of optical chip technologies and sophisticated artificial intelligence applications.
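The 68% preprocessing saving quoted above follows directly from the per-subject times reported in Table 1 (25 min in EEGLAB versus 8 min with the automatic pipeline), as this short check shows:

```python
# Preprocessing time savings implied by Table 1: 25 min per subject with
# manual EEGLAB processing versus 8 min with the automatic CNN pipeline.
eeglab_min, cnn_min = 25, 8
savings = (eeglab_min - cnn_min) / eeglab_min
print(f"preprocessing time saved: {savings:.0%}")
```

Because both pipelines scale roughly linearly with the number of subjects in Table 1, the same percentage holds at n = 10 and n = 20.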

5. Conclusions

Our study demonstrates an advancement in EEG signal analysis through the application of CNNs to the 30–40 Hz gamma band. The development of a CNN-based model for cognitive state classification, achieving 91.42% accuracy on the test set, shows improvement over some traditional machine learning approaches in this domain. The preprocessing pipeline integrating ERSP analysis with binary classification based on power thresholds effectively prepares complex EEG signals for machine learning analysis. The CNN architecture, tailored for EEG data, captures both fine-grained spectral features and broader patterns in brain signals. The consistent performance across training, validation, and test sets indicates learning and generalization capabilities. The balance between precision and recall suggests that our model is capable of identifying true positives while limiting false positives, which is valuable in EEG analysis where accurate detection and minimal false alarms are important. These results indicate potential for application in research and clinical settings that require detection of specific brain activity patterns. In the realm of behavioral sciences and neuroscience, our approach offers a more efficient method for analyzing brain activity patterns associated with specific behaviors and cognitive processes. This could lead to a more nuanced understanding of the neural correlates of various cognitive and behavioral processes, learning processes, and the effects of various interventions on neural activity. The ability to quickly process and classify EEG data could be particularly valuable in studies of neurological disorders, where identifying subtle changes in brain activity patterns is crucial. Furthermore, this approach has potential applications in the field of neuroeducation. By providing a more efficient way to analyze brain activity during learning tasks, it could help in developing more effective teaching methods tailored to individual cognitive patterns. 
This could lead to the design of adaptive learning environments that respond in real-time to a student’s cognitive state, optimizing the learning process. Additionally, in the context of neurofeedback and cognitive training, our method could offer more precise and timely feedback, potentially enhancing the effectiveness of these interventions.
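The power-threshold labelling step in our preprocessing pipeline (Figure 5) can be sketched in a few lines: mark each time-frequency point whose ERSP power exceeds the 85th percentile of the map. The ERSP values below are illustrative placeholders, not data from the study, and the nearest-rank percentile is one of several reasonable conventions.

```python
def percentile(values, q):
    """Nearest-rank percentile of a list of numbers (q in [0, 100])."""
    ordered = sorted(values)
    rank = max(0, min(len(ordered) - 1, round(q / 100 * len(ordered)) - 1))
    return ordered[rank]

# Toy ERSP map: rows = frequencies (30-40 Hz), cols = time bins,
# values = power change vs. baseline (illustrative only).
ersp = [
    [0.2, 1.5, 3.8, 0.9],
    [0.1, 4.2, 5.1, 1.0],
    [0.3, 0.8, 2.9, 0.4],
]
flat = [v for row in ersp for v in row]
threshold = percentile(flat, 85)
# Binary mask: 1 where power exceeds the 85th-percentile threshold.
binary = [[1 if v > threshold else 0 for v in row] for row in ersp]
```

The resulting binary mask is what the CNN consumes as a high/low gamma-activity label map.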

Author Contributions

D.M. conceived the project and provided overall supervision. N.A. led the experimental work, organized the EEG data, and developed the core idea with the support and guidance of D.M. E.N. and G.C.L. conducted the simulations with assistance from D.M. N.A., E.N., G.C.L. and D.M. drafted the manuscript and response letter. E.N., G.C.L. and N.A. prepared the figures. All authors contributed to the review and assessment of the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Schomer, D.L.; da Silva, F.L. Niedermeyer’s Electroencephalography: Basic Principles, Clinical Applications, and Related Fields, 6th ed.; Oxford University Press: New York, NY, USA, 2012; pp. 1–1269. [Google Scholar]
  2. Ramzan, M.; Dawn, S. A Survey of Brainwaves using Electroencephalography (EEG) to develop Robust Brain-Computer Interfaces (BCIs): Processing Techniques and Algorithms. In Proceedings of the 2019 9th International Conference on Cloud Computing, Data Science & Engineering (Confluence), Noida, India, 10–11 January 2019. [Google Scholar]
  3. Frid, A. Differences in phase synchrony of brain regions between regular and dyslexic readers. In Proceedings of the 2014 IEEE 28th Convention of Electrical & Electronics Engineers in Israel (IEEEI), Eilat, Israel, 3–5 December 2014. [Google Scholar]
  4. Jensen, O.; Kaiser, J.; Lachaux, J.-P. Human gamma-frequency oscillations associated with attention and memory. Trends Neurosci. 2007, 30, 317–324. [Google Scholar] [CrossRef]
  5. Sosa, O.A.P.; Quijano, Y.; Doniz, M.; Quero, J.E.C. Development of an EEG signal processing program based on EEGLAB. In Proceedings of the 2011 Pan American Health Care Exchanges, Rio de Janeiro, Brazil, 28 March–1 April 2011. [Google Scholar]
  6. Mullen, T.R.; Kothe, C.A.E.; Chi, Y.M.; Ojeda, A.; Kerth, T.; Makeig, S.; Jung, T.-P.; Cauwenberghs, G. Real-time neuroimaging and cognitive monitoring using wearable dry EEG. IEEE Trans. Biomed. Eng. 2015, 62, 2553–2567. [Google Scholar] [CrossRef] [PubMed]
  7. Shou, G.; Mosconi, M.W.; Ethridge, L.E.; Sweeney, J.A.; Ding, L. Resting-state Gamma-band EEG Abnormalities in Autism. In Proceedings of the 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Honolulu, HI, USA, 17–21 July 2018. [Google Scholar]
  8. Nobukawa, S.; Wagatsuma, N.; Inagaki, K. Gamma Band Functional Connectivity Enhanced by Driving Experience. In Proceedings of the 2021 IEEE 3rd Global Conference on Life Sciences and Technologies (LifeTech), Kyoto, Japan, 10–12 March 2021. [Google Scholar]
  9. Whitham, E.; Pope, K.; Fitzgibbon, S.; Lewis, T.; Clark, C.R.; Loveless, S.; Broberg, M.; Wallace, A.; DeLosAngeles, D.; Lillie, P.; et al. Scalp electrical recording during paralysis: Quantitative evidence that EEG frequencies above 20 Hz are contaminated by EMG. Clin. Neurophysiol. Off. J. Int. Fed. Clin. Neurophysiol. 2007, 118, 1877–1888. [Google Scholar] [CrossRef] [PubMed]
  10. Chen, Q.; Garcea, F.; Jacobs, R.; Mahon, B. Abstract Representations of Object-Directed Action in the Left Inferior Parietal Lobule. Cereb. Cortex 2018, 28, 2162–2174. [Google Scholar] [CrossRef] [PubMed]
  11. Soran, B.; Xie, Z.; Tungaraza, R.; Lee, S.-I.; Shapiro, L.; Grabowski, T. Parcellation of human inferior parietal lobule based on diffusion MRI. In Proceedings of the 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, San Diego, CA, USA, 28 August–1 September 2012. [Google Scholar]
  12. Boissonneau, S.; Lemaître, A.-L.; Herbet, G.; Ng, S.; Duffau, H.; Moritz-Gasser, S. Evidence for a critical role of the left inferior parietal lobule and underlying white matter connectivity in proficient text reading. J. Neurosurg. 2022, 138, 1433–1442. [Google Scholar] [CrossRef]
  13. Liao, Y.-C.; Yang, C.-J.; Yu, H.-Y.; Huang, C.-J.; Hong, T.-Y.; Li, W.-C.; Chen, L.-F.; Hsieh, J.-C. Inner sense of rhythm: Percussionist brain activity during rhythmic encoding and synchronization. Front. Neurosci. 2024, 18, 1342326. [Google Scholar] [CrossRef]
  14. Alzubaidi, L.; Zhang, J.; Humaidi, A.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.; Al-Amidie, M.; Farhan, L. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8, 1–74. [Google Scholar] [CrossRef]
  15. Junliang, C. CNN or RNN: Review and Experimental Comparison on Image Classification. In Proceedings of the 2022 IEEE 8th International Conference on Computer and Communications (ICCC), Chengdu, China, 9–12 December 2022. [Google Scholar]
  16. Craik, A.; He, Y.; Contreras-Vidal, J. Deep learning for Electroencephalogram (EEG) classification tasks: A review. J. Neural Eng. 2019, 16, 031001. [Google Scholar] [CrossRef]
  17. Cohen, E.; Malka, D.; Shemer, A.; Shahmoon, A.; Zalevsky, Z.; London, M. Neural networks within multi-core optic fibers. Sci. Rep. 2016, 6, 29080. [Google Scholar] [CrossRef]
  18. Roy, Y.; Banville, H.; de Albuquerque, I.M.C.; Gramfort, A.; Falk, T.; Faubert, J. Deep learning-based electroencephalography analysis: A systematic review. J. Neural Eng. 2019, 16, 051001. [Google Scholar] [CrossRef]
  19. Zhou, M.; Tian, C.; Rui, C.; Wang, B.; Niu, Y.; Hu, T.; Guo, H.; Xiang, J. Epileptic seizure detection based on EEG signals and CNN. Front. Neuroinform. 2018, 12, 95. [Google Scholar] [CrossRef] [PubMed]
  20. Devi, D.; Sophia, S. GA-CNN: Analyzing student’s cognitive skills with EEG data using a hybrid deep learning approach. Biomed. Signal Process. Control 2024, 90, 105888. [Google Scholar] [CrossRef]
  21. TaghiBeyglou, B.; Shahbazi, A.; Bagheri, F.; Akbarian, S.; Jahed, M. Detection of ADHD cases using CNN and classical classifiers of raw EEG. Comput. Methods Programs Biomed. Update 2022, 2, 100080. [Google Scholar] [CrossRef]
  22. Hao, T.; Xu, K.; Zheng, X.; Li, J.; Chen, S.; Nie, W. Towards mental load assessment for high-risk works driven by psychophysiological data: Combining a 1D-CNN model with random forest feature selection. Biomed. Signal Process. Control 2024, 96, 106615. [Google Scholar] [CrossRef]
  23. Neeraj; Singhal, V.; Mathew, J.; Behera, R.K. Detection of alcoholism using EEG signals and a CNN-LSTM-ATTN network. Comput. Biol. Med. 2021, 138, 104940. [Google Scholar] [CrossRef]
  24. Barnova, K.; Mikolasova, M.; Kahankova, R.V.; Jaros, R.; Kawala-Sterniuk, A.; Snasel, V.; Mirjalili, S.; Pelc, M.; Martinek, R. Implementation of artificial intelligence and machine learning-based methods in brain–computer interaction. Comput. Biol. Med. 2023, 163, 107135. [Google Scholar] [CrossRef]
  25. Zhao, T.; Zhang, J.; Wang, Z.; Alturki, R. An improved deep learning mechanism for EEG recognition in sports health informatics. Neural Comput. Appl. 2021, 35, 14577–14589. [Google Scholar] [CrossRef]
  26. Munjal, R.; Varshney, T.; Choudhary, A.; Dhiman, R. Convolutional Neural Network Based Models for Identification of Brain State Associated with Isha Shoonya Meditation. In Proceedings of the 2023 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS), Greater Noida, India, 3–4 November 2023. [Google Scholar]
  27. Lenartowicz, A.; Loo, S.K. Use of EEG to diagnose ADHD. Curr. Psychiatry Rep. 2014, 16, 498. [Google Scholar] [CrossRef]
  28. eego™mylab. ANT Neuro, [Online]. Available online: https://www.ant-neuro.com/products/eego-mylab (accessed on 7 August 2024).
  29. Kim, H.; Luo, J.; Chu, S.; Cannard, C.; Hoffmann, S.; Miyakoshi, M. ICA’s bug: How ghost ICs emerge from effective rank deficiency caused by EEG electrode interpolation and incorrect re-referencing. Front. Signal Process. 2023, 3, 1064138. [Google Scholar]
  30. Independent Component Analysis for Artifact Removal-Issues with Data Rank Deficiencies. EEGLAB Wiki, [Online]. Available online: https://eeglab.org/tutorials/06_RejectArtifacts/RunICA.html#issues-with-data-rank-deficiencies (accessed on 13 September 2024).
  31. Ho, Y.; Wookey, S. The Real-World-Weight Cross-Entropy Loss Function: Modeling the Costs of Mislabeling. IEEE Access 2020, 8, 4806–4813. [Google Scholar] [CrossRef]
  32. Foody, G. Challenges in the real world use of classification accuracy metrics: From recall and precision to the Matthews correlation coefficient. PLoS ONE 2023, 18, e0291908. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Topographical distribution of EEG electrode placement. 64-channel EEG cap layout with key channels (3LD, 4LD, 4LC, 5LC) highlighted.
Figure 2. Architecture of CNN for EEG signal classification.
Figure 3. Topographical EEG activity patterns across three subjects showing individual variations in brain activation.
Figure 4. ERSP analysis showing power changes across time and frequency in the 30–40 Hz range, with warmer colors indicating increased power. Baseline percentage quantifies power changes relative to pre-stimulus activity.
Figure 5. Binary classification of EEG data based on ERSP analysis, with black regions indicating time-frequency points exceeding the 85th percentile power threshold. Baseline percentage quantifies power changes relative to pre-stimulus activity.
Figure 6. Training and validation accuracy across iterations for the CNN model in EEG signal classification.
Figure 7. Training and validation precision across iterations for the CNN model in EEG signal classification.
Figure 8. Training and validation recall across iterations for the CNN model in EEG signal classification.
Figure 9. CNN test set performance metrics for EEG classification across varying network depths.
Table 1. Comparison of processing times between EEGLAB and Automatic-CNN approaches for EEG data analysis across different numbers of subjects (n).
| Tested Approach | n = 1 Pre-Processing | n = 1 Classification | n = 10 Pre-Processing | n = 10 Classification | n = 20 Pre-Processing | n = 20 Classification |
| --- | --- | --- | --- | --- | --- | --- |
| EEGLAB | 25 min | 10 min | ~250 min | ~100 min | ~500 min | ~200 min |
| Automatic-CNN | 8 min | <1 s | ~80 min | <1 s | ~160 min | <1 s |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
