Enhancing EEG-Based Emotion Detection with Hybrid Models: Insights from DEAP Dataset Applications
Abstract
1. Introduction
2. Related Works
3. Research Design and Methodology
3.1. Dataset Overview
- Prior studies using the DEAP dataset have shown that emotional ratings tend to cluster around the mid-point of the 1–9 scale, making 6.5 a more effective threshold for distinguishing pronounced emotional states;
- Using 5 would lead to an imbalanced dataset, as many ratings naturally fall around this median value, whereas 6.5 ensures a clearer separation between the high and low categories;
- This threshold aligns with established research, where values between 6 and 7 are often employed for improved emotion classification;
- Focusing on stronger emotional responses enhances model generalization and reduces noise from subjective rating variability (a minimal labeling sketch follows this list).
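To make the labeling concrete, the following minimal Python sketch shows how a 6.5 cut-off could binarize DEAP’s 1–9 self-assessment ratings; the rating values below are illustrative stand-ins, not taken from the dataset.

```python
import numpy as np

# DEAP self-assessment ratings range from 1 to 9 per trial.
# Illustrative values; real labels come from the dataset's rating files.
valence_ratings = np.array([2.1, 7.3, 5.0, 8.6, 6.4, 9.0])

THRESHOLD = 6.5  # separates pronounced high-valence trials from the rest

# Binarize: 1 = high valence, 0 = low valence
labels = (valence_ratings > THRESHOLD).astype(int)
print(labels)  # [0 1 0 1 0 1]
```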
- Standardization: Preprocessed and labeled data ensure reproducibility.
- Rich Annotations: Multi-dimensional ratings enable nuanced analysis.
- Public Accessibility: Facilitates direct comparison with state-of-the-art methods.
3.2. Signal Preprocessing and Feature Extraction
3.2.1. Bandpass Filtering for Frequency Isolation
- Delta (0.5–4 Hz): Linked to deep relaxation and unconscious processing.
- Theta (4–8 Hz): Involved in memory retrieval and emotional regulation.
- Alpha (8–13 Hz): Reflects relaxed alertness; suppressed during high arousal.
- Beta (13–30 Hz): Associated with active concentration and anxiety.
- Gamma (30–100 Hz): Tied to cross-modal sensory integration and peak arousal.
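As an illustration of this step, the sketch below isolates each band with MNE’s filter_data utility; the trial array is a random stand-in. The Gamma band is capped at 45 Hz here because DEAP’s preprocessed signals are sampled at 128 Hz (Nyquist = 64 Hz), which bounds usable content well below the 100 Hz upper edge listed above.

```python
import numpy as np
import mne

sfreq = 128.0  # DEAP's preprocessed signals are downsampled to 128 Hz
eeg = np.random.randn(32, int(60 * sfreq))  # stand-in trial: 32 channels x 60 s

# Band edges follow the list above; Gamma is truncated at 45 Hz because the
# 128 Hz sampling rate leaves no usable signal near 100 Hz.
bands = {"delta": (0.5, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0),
         "beta": (13.0, 30.0), "gamma": (30.0, 45.0)}

band_signals = {
    name: mne.filter.filter_data(eeg, sfreq, l_freq=lo, h_freq=hi, verbose=False)
    for name, (lo, hi) in bands.items()
}
print({name: sig.shape for name, sig in band_signals.items()})
```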
3.2.2. Power Spectral Density (PSD) Estimation
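Below is a minimal sketch of this estimation step using SciPy’s implementation of Welch’s method, which averages periodograms over overlapping windows; the 2 s window length and the toy single-channel signal are assumptions for illustration.

```python
import numpy as np
from scipy.signal import welch
from scipy.integrate import trapezoid

sfreq = 128.0
signal = np.random.randn(int(60 * sfreq))  # stand-in: one channel, one 60 s trial

# Welch's method: average periodograms over overlapping segments
freqs, psd = welch(signal, fs=sfreq, nperseg=int(2 * sfreq))  # 2 s windows (assumed)

def band_power(freqs, psd, lo, hi):
    """Integrate the PSD over [lo, hi] Hz to obtain the band power."""
    mask = (freqs >= lo) & (freqs <= hi)
    return trapezoid(psd[mask], freqs[mask])

alpha_power = band_power(freqs, psd, 8.0, 13.0)
print(f"Alpha band power: {alpha_power:.4f}")
```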
3.2.3. Feature Standardization
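A short sketch of z-score standardization with scikit-learn, assuming a trials x features matrix of band powers; fitting the statistics on the training split only avoids information leakage into the test set.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Stand-in feature matrices: trials x (channels x bands) band powers
X_train = np.random.rand(100, 160)  # e.g., 32 channels x 5 bands (assumed layout)
X_test = np.random.rand(25, 160)

scaler = StandardScaler()                    # z-score: (x - mean) / std per feature
X_train_std = scaler.fit_transform(X_train)  # statistics estimated on training data only
X_test_std = scaler.transform(X_test)        # reused unchanged on the test split
```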
3.2.4. Feature Relevance to Emotion Recognition
- Neurophysiological Basis: Emotional processing is mediated by synchronized neural oscillations, which are more discriminative in the frequency domain [20].
- Noise Robustness: Spectral features (e.g., Alpha asymmetry) are less sensitive to transient artifacts than raw EEG voltages.
3.3. Implementation Tools and Techniques
Key Tools and Libraries
- MNE for EEG Preprocessing: MNE is a specialized library for processing, analyzing, and visualizing electrophysiological data, including EEG. Its comprehensive functionality makes it indispensable for tasks such as filtering, epoching, and feature extraction. In this study, MNE is primarily used for:
  - Bandpass Filtering: Isolating the frequency bands (Delta, Theta, Alpha, Beta, and Gamma) that are critical for understanding emotional processing. For example, Alpha waves (8–13 Hz) are associated with relaxation, while Beta waves (13–30 Hz) are linked to active thinking and concentration.
  - Epoching: Segmenting continuous EEG data into time-locked epochs corresponding to specific emotional stimuli, ensuring precise alignment with experimental conditions.
  - Artifact Removal: Identifying and removing noise from EEG signals, such as eye blinks or muscle movements, to improve data quality.
  MNE’s robust capabilities in handling EEG data make it a cornerstone of our preprocessing pipeline, ensuring accurate and reliable input for downstream machine learning models.
- Pickle for Data Storage: Pickle is used for serializing and deserializing Python objects, enabling efficient storage and retrieval of preprocessed EEG data and model outputs. This is particularly useful for:
  - Saving intermediate results (e.g., filtered EEG data, extracted features) to avoid redundant computations.
  - Storing trained models for later evaluation or deployment, ensuring reproducibility and scalability.
- Keras for Deep Learning: Keras, a high-level deep learning API, is employed for building and training neural network models. Its user-friendly interface and seamless integration with TensorFlow make it ideal for developing complex architectures, such as Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks. In this study, Keras is used to:
  - Design models tailored for EEG-based emotion recognition, leveraging its flexibility to experiment with different architectures.
  - Implement advanced techniques like attention mechanisms, which enhance the model’s ability to focus on relevant EEG features for emotion classification.
- SciPy for Scientific Computing: SciPy provides essential mathematical and statistical functions for scientific computing. In this work, it is used for:
  - Signal processing tasks, such as computing the power spectral density (PSD) to analyze frequency-domain characteristics of EEG signals.
  - Statistical analysis to validate the significance of results and ensure the robustness of the proposed models.
- Scikit-learn for Machine Learning: Scikit-learn is a versatile library for traditional machine learning tasks, including classification, regression, and model evaluation. In this study, it is used for:
  - Implementing baseline models (e.g., Support Vector Machines, Random Forest) to compare against deep learning approaches.
  - Evaluating model performance using metrics such as accuracy, precision, recall, and F1-score, ensuring a comprehensive assessment of classification results.
3.4. Benchmarking with Classical Machine Learning
3.4.1. Model Selection and Rationale
- K-Nearest Neighbors (KNN): A non-parametric method effective for small datasets with local patterns. It is suitable for capturing similarities in spectral features across trials. In our implementation, we set k = 7 based on cross-validation results.
- Support Vector Machine (SVM): Maximizes the margin between classes using kernel tricks. We selected the Radial Basis Function (RBF) kernel to handle non-linear decision boundaries in EEG data, with C = 1.0 for regularization.
- Decision Tree (DT): Provides interpretable rules based on feature thresholds.
- Random Forest (RF): An ensemble of DTs that averages their predictions; robust to noise through bootstrap aggregation [26]. A minimal scikit-learn sketch of these four baselines follows this list.
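The sketch below instantiates the four baselines with the hyperparameters stated above; the feature matrix is a random stand-in, and the Random Forest tree count is an assumption not given in the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Stand-in data: in the real pipeline these are the standardized PSD features
X = np.random.rand(200, 160)
y = np.random.randint(0, 2, 200)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=7),   # k = 7, as stated above
    "SVM": SVC(kernel="rbf", C=1.0),              # RBF kernel, C = 1.0
    "DT": DecisionTreeClassifier(random_state=42),
    "RF": RandomForestClassifier(n_estimators=100, random_state=42),  # tree count assumed
}

for name, clf in models.items():
    scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validation
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```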
3.4.2. Data Preparation
3.5. Deep Learning with Autoencoders and Transformers
- Reduce the high dimensionality of EEG data while preserving discriminative features.
- Capture long-range temporal dependencies in EEG signals, which are critical for emotion dynamics.
3.5.1. Autoencoder for Dimensionality Reduction
- Input Layer: A 32-dimensional vector (one channel’s time-series).
- Encoder: Two dense layers (64 and 32 units) with ReLU activation, compressing the input into a 16-dimensional latent space.
- Decoder: Two dense layers (32 and 64 units) reconstructing the original input.
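A minimal Keras sketch of this autoencoder follows; the latent-layer activation, reconstruction loss, and optimizer are assumptions where the text above does not specify them.

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(32,))                        # one channel's 32-sample vector
x = layers.Dense(64, activation="relu")(inputs)          # encoder: 64 -> 32 units
x = layers.Dense(32, activation="relu")(x)
latent = layers.Dense(16, activation="relu")(x)          # 16-dim bottleneck (activation assumed)
x = layers.Dense(32, activation="relu")(latent)          # decoder: 32 -> 64 units
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(32, activation="linear")(x)       # reconstruct the input

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")        # loss/optimizer assumed

# After training, the encoder alone yields the compressed features:
encoder = keras.Model(inputs, latent)
```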
3.5.2. Transformer for Temporal Modeling
- Input Layer: Encoded EEG features (16 dimensions).
- Multi-Head Attention: Consisting of 4 attention heads with 128-dimensional embeddings.
- Feed-Forward Network: Two dense layers (128 and 64 units) with ReLU activation.
- Output Layer: Sigmoid activation for binary classification (high/low valence or arousal).
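A sketch of the Transformer classifier under stated assumptions: the sequence length, the residual/normalization layout, and the use of global average pooling before the output are not specified above and are chosen here for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

seq_len = 60                                            # encoded steps per trial (assumed)
inputs = keras.Input(shape=(seq_len, 16))               # 16-dim encoded EEG features
x = layers.Dense(128)(inputs)                           # project into the 128-dim embedding
attn = layers.MultiHeadAttention(num_heads=4, key_dim=32)(x, x)  # 4 heads x 32 dims = 128
x = layers.LayerNormalization()(x + attn)               # residual connection + normalization
x = layers.Dense(128, activation="relu")(x)             # feed-forward network: 128 -> 64
x = layers.Dense(64, activation="relu")(x)
x = layers.GlobalAveragePooling1D()(x)                  # collapse the time axis
outputs = layers.Dense(1, activation="sigmoid")(x)      # high/low valence or arousal

transformer = keras.Model(inputs, outputs)
transformer.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```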
3.5.3. Hybrid LSTM-Transformer Model
- LSTM Layer: Consisting of 128 units capturing short-term EEG dynamics.
- Transformer Encoder: Applied to LSTM outputs for global context.
- Positional Encoding: Added to the input to preserve temporal order.
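A sketch of the hybrid model; sinusoidal positional encoding is assumed (the text says only that positional information is added), and the attention-head count is carried over from the Transformer above.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def positional_encoding(seq_len, d_model):
    """Standard sinusoidal positional encoding (assumed variant)."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    enc = np.zeros((seq_len, d_model))
    enc[:, 0::2] = np.sin(angles[:, 0::2])
    enc[:, 1::2] = np.cos(angles[:, 1::2])
    return enc.astype("float32")

seq_len, n_features = 60, 16                            # segment layout assumed
inputs = keras.Input(shape=(seq_len, n_features))
x = layers.LSTM(128, return_sequences=True)(inputs)     # short-term EEG dynamics
x = x + positional_encoding(seq_len, 128)               # preserve temporal order
attn = layers.MultiHeadAttention(num_heads=4, key_dim=32)(x, x)  # global context
x = layers.LayerNormalization()(x + attn)
x = layers.GlobalAveragePooling1D()(x)
outputs = layers.Dense(1, activation="sigmoid")(x)

hybrid = keras.Model(inputs, outputs)
hybrid.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```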
3.6. Comparative Analysis of Deep Architectures
3.6.1. Data Preparation
3.6.2. Model Architectures
- GRU: A Gated Recurrent Unit with 128–32 units, designed to balance computational efficiency and temporal modeling. Dropout (rate = 0.3) was applied after each GRU layer to prevent overfitting.
- 1D CNN: Three convolutional layers (128, 128, 64 filters) with kernel size = 3 and ReLU activation, followed by max-pooling and dense layers (64, 32, 16 units). Batch normalization improved convergence.
- BiLSTM: A bidirectional LSTM with 128–32 units, capturing past and future EEG context. Dropout (rate = 0.3) and layer normalization enhanced generalization.
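A Keras sketch of the BiLSTM, the strongest of the three architectures in our experiments; “128–32 units” is read here as two stacked bidirectional layers, and the input layout (2 s of 32-channel EEG at 128 Hz) is an assumption.

```python
from tensorflow import keras
from tensorflow.keras import layers

seq_len, n_channels = 256, 32        # 2 s at 128 Hz x 32 channels (assumed layout)
bilstm = keras.Sequential([
    keras.Input(shape=(seq_len, n_channels)),
    layers.Bidirectional(layers.LSTM(128, return_sequences=True)),  # past + future context
    layers.Dropout(0.3),             # dropout rate stated in the text
    layers.LayerNormalization(),
    layers.Bidirectional(layers.LSTM(32)),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),
])
bilstm.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```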
3.7. Explainable AI (XAI) with SHAP
3.7.1. SHAP Methodology
For a model f and a feature i, the SHAP value is the feature’s average marginal contribution across all feature subsets:

$$\phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,(|F|-|S|-1)!}{|F|!} \left[ f_{S \cup \{i\}}\left(x_{S \cup \{i\}}\right) - f_S\left(x_S\right) \right]$$

where:
- F is the set of all features;
- S is a subset of features excluding feature i;
- $f_S(x_S)$ is the model’s prediction using only the features in S.
- Efficiency: The sum of SHAP values equals the difference between the model’s prediction and the baseline (expected output).
- Symmetry: Two features contributing equally to all subsets receive the same SHAP value.
- Additivity: SHAP values for a combination of models are the sum of their individual SHAP values.
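A minimal model-agnostic sketch using the shap library’s KernelExplainer; the fitted model, background size, and sampling budget are illustrative stand-ins rather than the paper’s exact configuration.

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

# Stand-in data and model: 200 trials x 160 band-power features (assumed layout)
X = np.random.rand(200, 160)
y = np.random.randint(0, 2, 200)
clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)

# Kernel SHAP treats the model as a black box; a small background sample
# approximates the expected prediction (the baseline in the Efficiency property).
background = X[:50]
explainer = shap.KernelExplainer(lambda d: clf.predict_proba(d)[:, 1], background)
shap_values = explainer.shap_values(X[:10], nsamples=100)  # explain 10 trials

# Global importance: mean |SHAP| per feature, mirroring the rankings in Section 3.7.2
importance = np.abs(shap_values).mean(axis=0)
print(importance.argsort()[::-1][:5])  # indices of the 5 most influential features
```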
3.7.2. Application to EEG Emotion Recognition
- Frontal Alpha Asymmetry: Higher Alpha power in the left frontal cortex (AF3) was strongly associated with positive valence, consistent with neuroscientific studies [28].
- Occipital Gamma Activity: Increased Gamma power in occipital channels (O1, O2) correlated with high arousal, reflecting heightened sensory processing during emotional stimuli.
- Channel Importance: Frontal (AF3, AF4) and occipital (O1, O2) channels were consistently ranked as the most influential, aligning with their roles in emotional regulation and visual processing.
3.7.3. Key Insights
- Neurophysiological Plausibility: The model’s reliance on frontal Alpha and occipital Gamma aligns with established EEG biomarkers of emotion [29].
- Feature Importance: SHAP identified the most discriminative EEG features, enabling targeted feature engineering in future studies.
- Model Transparency: By explaining individual predictions, SHAP enhances trust in the model’s decisions, critical for real-world applications like mental health monitoring.
4. Results and Discussion
4.1. Performance of Classical Machine Learning Models
4.2. Dimensionality Reduction with Autoencoders
4.3. Hybrid Architectures for Improved Performance
4.4. Recurrent Neural Networks for Temporal Data
4.5. Explainability with SHAP
4.6. Refining the Model and Feature Selection
- Dimensionality reduction by excluding less influential features like Gamma.
- Optimizing the weighting of features to enhance model robustness.
- Exploring SHAP-based feature fusion to better integrate the most impactful information.
4.7. Comparison with Other Real-Time Methods
4.8. Time Complexity of the Models
4.9. Real-Time Emotion Detection System
- Data Acquisition: EEG signals are captured using a wearable EEG device or a prerecorded dataset, ensuring compatibility with both live and offline scenarios.
- Preprocessing: The raw EEG data are preprocessed to remove noise and artifacts, ensuring high-quality input for the model.
- Feature Extraction: The BiLSTM model processes the 2 s EEG segments, extracting meaningful features that capture the temporal dynamics of the signals.
- Emotion Classification: The extracted features are classified into emotional states (e.g., valence and arousal) using the trained BiLSTM model.
- Visualization and Feedback: The results are displayed in real time, with the EEG signal plot and the detected emotional state presented in an intuitive user interface.
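A compact sketch of this loop under stated assumptions: the model file name, the device-reading stub, and the simplified per-channel normalization standing in for the full preprocessing stage are all hypothetical.

```python
import time
import numpy as np
from tensorflow import keras

SFREQ = 128                     # Hz, DEAP's preprocessed sampling rate
WINDOW = 2 * SFREQ              # 2 s segments, as described above

bilstm = keras.models.load_model("bilstm_emotion.keras")  # hypothetical model file

def read_eeg_chunk():
    """Stand-in for a wearable-device driver or a prerecorded stream."""
    return np.random.randn(32, WINDOW)                    # 32 channels x 2 s

def preprocess(segment):
    """Simplified stand-in for the filtering/artifact-removal stage."""
    return (segment - segment.mean(axis=1, keepdims=True)) / segment.std(axis=1, keepdims=True)

while True:
    segment = preprocess(read_eeg_chunk())
    features = segment.T[np.newaxis, ...]                 # (1, time, channels) for the BiLSTM
    prob = float(bilstm.predict(features, verbose=0)[0, 0])
    state = "high" if prob > 0.5 else "low"
    print(f"p(high valence) = {prob:.2f} -> {state}")     # feedback step of the pipeline
    time.sleep(2)                                         # wait for the next 2 s window
```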
- Mental Health Monitoring: The tool can be used to monitor emotional states in individuals with mental health conditions, such as depression or anxiety, providing clinicians with real-time insights into their patients’ emotional well-being. This could enable timely interventions and personalized treatment plans.
- Human–Computer Interaction (HCI): In HCI applications, the system can be integrated into adaptive interfaces that respond to the user’s emotional state. For example, a computer system could adjust its behavior based on whether the user is frustrated, relaxed, or engaged, enhancing user experience and productivity.
- Adaptive Learning Systems: In educational settings, the tool could be used to detect students’ emotional states during learning activities. This information could help educators tailor their teaching methods to maintain student engagement and improve learning outcomes.
- Gaming and Entertainment: The system could be integrated into gaming platforms to create immersive experiences that adapt to the player’s emotions in real time, enhancing engagement and enjoyment.
5. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Meaning
---|---
EEG | Electroencephalography
PSD | Power Spectral Density
ICA | Independent Component Analysis
STFT | Short-Time Fourier Transform
CNN | Convolutional Neural Network
LSTM | Long Short-Term Memory
BiLSTM | Bidirectional Long Short-Term Memory
SVM | Support Vector Machine
DEAP | Database for Emotion Analysis using Physiological Signals
KNN | K-Nearest Neighbors
RF | Random Forest
DT | Decision Tree
GRU | Gated Recurrent Unit
SHAP | SHapley Additive exPlanations
XAI | Explainable Artificial Intelligence
RBF | Radial Basis Function
FIR | Finite Impulse Response
FFT | Fast Fourier Transform
MNE | Magnetoencephalography and EEG processing software
AI | Artificial Intelligence
GDPR | General Data Protection Regulation
AF3, AF4, O1, O2 | EEG Electrode Positions (International 10–20 system)
References
- Mauss, I.B.; Robinson, M.D. Measures of emotion: A review. Cogn. Emot. 2009, 23, 209–237. [Google Scholar] [CrossRef]
- Govindaraju, V.; Thangam, D. Emotion Recognition in Human–Machine Interaction and a Review in Interpersonal Communication Perspective. In Human–Machine Collaboration and Emotional Intelligence in Industry 5.0; IGI Global: Hershey, PA, USA, 2024; pp. 329–343. [Google Scholar] [CrossRef]
- Lan, Z. EEG-based Emotion Recognition Using Machine Learning Techniques. Ph.D. Thesis, Nanyang Technological University, Singapore, 2018. [Google Scholar] [CrossRef]
- Jerritta, S.; Murugappan, M.; Nagarajan, R.; Wan, K. Physiological Signals Based Human Emotion Recognition: A Review. In Proceedings of the 2011 IEEE 7th International Colloquium on Signal Processing and Its Applications, Penang, Malaysia, 4–6 March 2011; pp. 410–415. [Google Scholar] [CrossRef]
- Adolphs, R. Cognitive Neuroscience of Human Social Behavior. Neuropsychologia 2003, 41, 119–126. [Google Scholar] [CrossRef]
- Böcker, K.B.E.; van Avermaete, J.A.G.; van den Berg-Lenssen, M.M.C. The International 10–20 System Revisited: Cartesian and Spherical Co-ordinates. Brain Topogr. 1994, 6, 231–235. [Google Scholar] [CrossRef] [PubMed]
- Czarnocki, J. Will New Definitions of Emotion Recognition and Biometric Data Hamper the Objectives of the Proposed AI Act? In Proceedings of the 2021 International Conference of the Biometrics Special Interest Group (BIOSIG), Darmstadt, Germany, 15–17 September 2021; pp. 1–4. [Google Scholar] [CrossRef]
- Mouazen, B.; Bendaouia, A. Leveraging Machine Learning and Clinical EEG Data for Multiple Sclerosis: A Systematic Review. SSRN 2024. [Google Scholar] [CrossRef]
- Akhand, M.A.H.; Maria, M.A.; Kamal, M.A.S.; Murase, K. Improved EEG-Based Emotion Recognition Through Information Enhancement in Connectivity Feature Map. Sci. Rep. 2023, 13, 13804. [Google Scholar] [CrossRef] [PubMed]
- Chen, J.X.; Zhang, P.W.; Mao, Z.J.; Huang, Y.F.; Jiang, D.M.; Zhang, Y.N. Accurate EEG-Based Emotion Recognition on Combined Features Using Deep Convolutional Neural Networks. IEEE Access 2019, 7, 44317–44328. [Google Scholar] [CrossRef]
- Alhagry, S.; Fahmy, A.A.; El-Khoribi, R.A. Emotion Recognition Based on EEG Using LSTM Recurrent Neural Network. Int. J. Adv. Comput. Sci. Appl. 2017, 8, 10. [Google Scholar] [CrossRef]
- Li, Y.; Huang, J.; Zhou, H.; Zhong, N. Human Emotion Recognition with Electroencephalographic Multidimensional Features by Hybrid Deep Neural Networks. Appl. Sci. 2017, 7, 1060. [Google Scholar] [CrossRef]
- Xing, X.; Li, Z.; Xu, T.; Shu, L.; Hu, B.; Xu, X. SAE+LSTM: A New Framework for Emotion Recognition from Multi-Channel EEG. Front. Neurorobot. 2019, 13, 37. [Google Scholar] [CrossRef]
- Pichandi, S.; Balasubramanian, G.; Chakrapani, V. Hybrid Deep Models for Parallel Feature Extraction and Enhanced Emotion State Classification. Sci. Rep. 2024, 14, 24957. [Google Scholar] [CrossRef]
- Chen, J.X.; Jiang, D.M.; Zhang, Y.N. A Hierarchical Bidirectional GRU Model with Attention for EEG-Based Emotion Classification. IEEE Access 2019, 7, 118530–118540. [Google Scholar] [CrossRef]
- Koelstra, S.; Muhl, C.; Soleymani, M.; Lee, J.-S.; Yazdani, A.; Ebrahimi, T.; Pun, T.; Nijholt, A.; Patras, I. Deap: A database for emotion analysis using physiological signals. IEEE Trans. Affect. Comput. 2011, 3, 18–31. [Google Scholar] [CrossRef]
- Jacob12138xieyuan. EEG-Based Emotion Recognition on DEAP. Available online: https://github.com/Jacob12138xieyuan/EEG-Based-Emotion-Recognition-on-DEAP (accessed on 10 February 2024).
- Soleymani, M.; Lichtenauer, J.; Pun, T.; Pantic, M. A Multimodal Database for Affect Recognition and Implicit Tagging. IEEE Trans. Affect. Comput. 2011, 3, 42–55. [Google Scholar] [CrossRef]
- Newson, J.J.; Thiagarajan, T.C. EEG Frequency Bands in Psychiatric Disorders: A Review of Resting State Studies. Front. Hum. Neurosci. 2019, 12, 521. [Google Scholar] [CrossRef] [PubMed]
- Gramfort, A.; Luessi, M.; Larson, E.; Engemann, D.A.; Strohmeier, D.; Brodbeck, C.; Parkkonen, L.; Hämäläinen, M.S. MNE Software for Processing MEG and EEG Data. NeuroImage 2014, 86, 446–460. [Google Scholar] [CrossRef]
- Welch, P. The Use of Fast Fourier Transform for the Estimation of Power Spectra: A Method Based on Time Averaging Over Short, Modified Periodograms. IEEE Trans. Audio Electroacoust. 1967, 15, 70–73. [Google Scholar] [CrossRef]
- Dwivedi, A.K.; Verma, O.P.; Taran, S. EEG-Based Emotion Recognition Using Optimized Deep-Learning Techniques. In Proceedings of the 2024 International Conference on Signal Processing and Integrated Networks (SPIN), Noida, India, 21–22 March 2024; pp. 372–377. [Google Scholar] [CrossRef]
- Li, X.; Zhang, Y.; Tiwari, P.; Song, D.; Hu, B.; Yang, M.; Zhao, Z.; Kumar, N.; Marttinen, P. EEG-Based Emotion Recognition: A Tutorial and Review. ACM Comput. Surv. 2022, 55, 79. [Google Scholar] [CrossRef]
- Unde, S.A.; Shriram, R. Coherence Analysis of EEG Signal Using Power Spectral Density. In Proceedings of the 2014 International Conference on Computational Science and Network Technology (CSNT), Bhopal, India, 7–9 April 2014; pp. 871–874. [Google Scholar] [CrossRef]
- Koelstra, S.; Muhl, C.; Soleymani, M.; Lee, J.-S.; Yazdani, A.; Ebrahimi, T.; Pun, T.; Nijholt, A.; Patras, I. EEG-based emotion recognition using wavelet decomposition and SVM. IEEE Trans. Affect. Comput. 2012, 3, 18–31. [Google Scholar] [CrossRef]
- Atkinson, J.; Campos, D. Random forest classification of EEG-based emotional states. Neurocomputing 2016, 172, 212–220. [Google Scholar] [CrossRef]
- Lundberg, S.M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2017; Volume 30, Available online: https://proceedings.neurips.cc/paper_files/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf (accessed on 10 February 2024).
- Coan, J.A.; Allen, J.J.B. Frontal EEG asymmetry as a moderator and mediator of emotion. Biol. Psychol. 2004, 67, 7–50. [Google Scholar] [CrossRef]
- Olbrich, S.; Arns, M. EEG biomarkers for major depressive disorder and schizophrenia: A review. Clin. EEG Neurosci. 2013, 44, 3–16. [Google Scholar] [CrossRef]
- Chabachib. EEG-Based Emotion Detection. GitHub Repository. 2024. Available online: https://github.com/Chabachib/EEG-Based-Emotion-Detection (accessed on 10 February 2024).
Study | Objective | Techniques Used | Results (Accuracy)
---|---|---|---
Chen et al. [10] | Valence, Arousal Classification | CNN-based Feature Extraction | 85.57% (Arousal), 88.76% (Valence)
Alhagry et al. [11] | Temporal EEG Analysis | LSTM with Cross-Validation | 85.65% (Valence), 85.45% (Arousal)
Li et al. [12] | Hybrid Model for Emotion Recognition | CNN + LSTM | 75.21% (Four-class classification)
Xing et al. [13] | Subject-Independent Recognition | SAE + LSTM | 81.10% (Valence), 74.38% (Arousal)
Pichandi et al. [14] | Feature Extraction with SVM | AlexNet + DenseNet + PCA + SVM | 95.54% (Valence), 97.26% (Arousal)
Chen et al. [15] | Attention-based Model | Hierarchical Bidirectional GRU with Attention | 69.3% (0.5 s EEG segments)
Models | Valence Training Accuracy | Valence Testing Accuracy | Arousal Training Accuracy | Arousal Testing Accuracy
---|---|---|---|---
KNN | 71.3% | 56.2% | 71.2% | 59.4%
SVM | 64.6% | 56.6% | 68.3% | 58.6%
Decision Tree | 93.7% | 49.6% | 89.6% | 59.4%
Random Forest | 99.3% | 60.9% | 98.9% | 58.2%
Models | Training Accuracy | Testing Accuracy
---|---|---
Transformers | 79.85% | 64.15%
Transformers + LSTM | 84.26% | 69.81%
Transformers + LSTM + additive fusion | 90.98% | 73.58%
LSTM + Autoencoders | 87.56% | 67.88%
Models | Training Accuracy | Testing Accuracy
---|---|---
BiLSTM | 85% | 94%
GRU | 83% | 93%
CNN | 84% | 91%
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).