Article

Emotion Recognition by Correlating Facial Expressions and EEG Analysis

by Adrian R. Aguiñaga 1,*,†,‡, Daniel E. Hernandez 1,‡, Angeles Quezada 1,†,‡ and Andrés Calvillo Téllez 2,‡
1 Tecnológico Nacional de México Campus Tijuana, Tijuana 22414, Mexico
2 Instituto Politécnico Nacional, Tijuana 22435, Mexico
* Author to whom correspondence should be addressed.
† Current address: Av Castillo de Chapultepec 562, Tomas Aquino, Tijuana 22414, Mexico.
‡ These authors contributed equally to this work.
Appl. Sci. 2021, 11(15), 6987; https://doi.org/10.3390/app11156987
Submission received: 30 June 2021 / Revised: 21 July 2021 / Accepted: 27 July 2021 / Published: 29 July 2021
(This article belongs to the Special Issue Multimodal Affective Computing)

Abstract

Emotion recognition is a fundamental task that any affective computing system must perform to adapt to the user’s current mood. The analysis of electroencephalography signals has gained prominence in the study of human emotions because of its non-invasive nature. This paper presents a two-stage deep learning model to recognize emotional states by correlating facial expressions and brain signals. Most works related to the analysis of emotional states are based on analyzing large segments of signals, generally as long as the evoked potential lasts, which could cause many other phenomena to be involved in the recognition process. Unlike other phenomena, such as epilepsy, emotions have no clearly defined marker of when an event begins or ends. The novelty of the proposed model resides in the use of facial expressions as markers to improve the recognition process. This work uses a facial emotion recognition (FER) technique to create identifiers each time an emotional response is detected and uses them to extract segments of the electroencephalography (EEG) records that can, a priori, be considered relevant to the analysis. The proposed model was tested on the DEAP dataset.

1. Introduction

The study of emotions and how they interact with computer systems has advanced considerably due to the significant developments in machine learning and increased computing power, allowing the analysis of several physiological responses to extract emotional traits, such as the voice, facial expressions, body temperature, heart rate, body movements or brain signals [1,2].
Emotion recognition has become a topic of interest for researchers in different areas, especially for the computer science community seeking to establish emotional interactions between humans and computers. Emotions play a crucial role in several activities related to human intelligence, such as decision making, perception, human interaction and human cognition in general. The analysis of EEG signals has gained prominence in the study of human emotions owing to its non-invasive nature and the development of consumer-grade EEG devices, which provide affordable and simple solutions for applications that require emotion recognition [3]. Several efforts have been made to analyze the physiological responses of emotions; some consider direct reactions such as facial expressions, while others focus on the analysis of biological signals such as the heart rate or brain activity [4,5,6,7].
The analysis of biological processes tends to focus on the fact that it is more difficult for the research subject to hide or fake a response to a stimulus, and therefore the response will be more reliable. However, the human body is complex, and many physiological reactions can be considered noise when they interact; for example, eye movements can disturb the information collected in electroencephalography (EEG) records. In addition, when we analyze cognitive phenomena such as emotions, there is no explicit reference of when an emotional stimulus begins or ends, so the full span of the signal is analyzed under the assumption that some trait within that period characterizes the emotion. This, however, assumes that the evoked response is the expected one and that no other responses occur throughout the experimental process, which cannot be guaranteed.
We propose to use facial expressions as time markers that dictate which moments in the brain-signal recordings we should focus on, and thus generate a more robust and accurate methodology. The structure of this proposal is shown in Figure 1. The first stage analyzes the video records using a facial emotion recognition (FER) technique to obtain markers that indicate which emotion occurred and when. This information is then used to extract fragments of the signals from the EEG record (the two seconds before the identifier appears) associated with these markers and to create a collection of data that, a priori, can be categorized as linked to an emotional process. The information obtained is subsequently processed using digital signal processing techniques and a recurrent neural network training phase.
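As an illustration, the following is a minimal Python sketch of the marker-driven segmentation described above. The 128 Hz sampling rate (the rate of DEAP's preprocessed data), the function and variable names, and the file referenced in the usage comment are assumptions for illustration only, not the exact implementation used here.

import numpy as np

def extract_marked_segments(eeg, marker_times, fs=128, pre_seconds=2.0):
    """Cut EEG windows that end at each FER marker (illustrative names)."""
    window = int(pre_seconds * fs)
    segments = []
    for t in marker_times:
        end = int(t * fs)          # marker position in samples
        start = end - window       # take the two seconds before the marker
        if start >= 0:             # discard markers too close to the start of the record
            segments.append(eeg[:, start:end])
    return np.stack(segments) if segments else np.empty((0, eeg.shape[0], window))

# Example (hypothetical file and marker times):
# eeg = np.load("s01_eeg.npy")                            # shape (n_channels, n_samples)
# X = extract_marked_segments(eeg, [12.4, 57.9, 103.2])   # shape (3, n_channels, 256)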
Finally, we use the structure presented in [8] to summarize the highlights of this article:
  • A methodology is proposed that delimits the analysis space for detecting emotions through EEG analysis and thus increases the certainty that a behavior linked to emotion is being recognized.
  • A boosting-style methodology is used in which two neural networks operate together to strengthen the result.
  • ESL-optimized networks are used to reduce the effects of noise.
  • An electrode reduction technique based on Brodmann’s regions is used.
  • The DEAP database is used as a reference point, and a recognition rate of 88.2 % ± 0.23 is obtained for the FER and 89.6 % ± 0.109 for the EEG analysis.

2. Materials and Methods

Affective computing emphasizes that emotions play an essential role in human behavior and decision-making processes; however, they have been ignored in the design of most digital systems, which highlights the need to develop or optimize methodologies that bring us closer to a more natural human–machine interaction.

2.1. Emotions

The definition of emotion used in this work is described by Scherer [9], who defines emotions as organismic responses of the nervous system to an event or experience. We also rely on the theories of Ekman and Russell to label and delimit emotional stimuli [10,11]. Ekman establishes that there are basic emotions that are present in all humans regardless of their environment: happiness, sadness, disgust, fear, surprise, and anger. Russell defines the circumplex model of emotions, which addresses the level of activation of an emotion by associating emotional states with their levels of arousal and valence. The combination of these theories creates a four-class emotion model: happiness (high arousal–high valence (HA–HV)), anger (high arousal–low valence (HA–LV)), sadness (low arousal–low valence (LA–LV)) and neutral (low arousal–high valence (LA–HV)). This model was implemented in our experimental process.
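A minimal sketch of this four-quadrant labelling follows, assuming self-assessment ratings on a 1–9 scale with 5 as the midpoint; the threshold and the function name are illustrative assumptions, not part of the original experimental protocol.

def quadrant_label(valence, arousal, midpoint=5.0):
    """Map valence/arousal ratings to the four circumplex quadrants."""
    if arousal >= midpoint and valence >= midpoint:
        return "happiness"   # HA-HV
    if arousal >= midpoint and valence < midpoint:
        return "anger"       # HA-LV
    if arousal < midpoint and valence < midpoint:
        return "sadness"     # LA-LV
    return "neutral"         # LA-HV

# quadrant_label(valence=6.8, arousal=7.2)  -> "happiness"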

Facial Emotion Recognition (FER)

Facial expressions are more important than we usually think, since our brains have developed a remarkable ability to identify faces and expressions, whether for social or survival purposes. Thus, there are various studies focused on facial emotion recognition (FER), specialized in detecting movements of the facial muscles [12,13,14]. Our work adopts the machine learning technique presented in [15], which proposes a convolutional neural network (CNN) optimized by an extreme sparse learning (ESL) algorithm (as can be observed in Figure 2), whose objective is to strengthen its performance in real-life situations. The extreme learning machine (ELM) is a very competitive classification technique, especially for multi-class classification problems. It also requires little tuning, which results in a simple implementation, fast learning, and better generalization performance [13].
The ESL and ELM were implemented under the assumption that the underlying sparse representation of natural signals or images can be efficiently approximated by a linear combination of a few dictionary elements. The dictionary is obtained either by applying predefined transforms to the data or by learning it directly from the training data, and since this usually leads to a satisfactory reconstruction, the objective function is defined by Equation (1):
$\min_{D,X} \lVert Y - DX \rVert_2^2 \quad \text{s.t.} \quad \lVert x_i \rVert_0 \le N_0,$
where $Y \in \mathbb{R}^{N \times S}$ is the set of input signals of dimension N.
The dictionary used for the sparse representation of Y is obtained by solving Equation (2):
$D = [d_1, d_2, \ldots, d_M] \in \mathbb{R}^{N \times M},$
which is the dictionary learned (typically overcomplete) over the training data, while $X = [x_1, x_2, \ldots, x_S] \in \mathbb{R}^{M \times S}$ is the sparse coefficient matrix of the inputs and $N_0$ is the sparsity constraint.
The objective function of ELM can be summarized as proposed by the authors of [13], as shown in Equation (3):
$\min_{\beta} \left( \lVert H(X)\,\beta - Z \rVert_2^2 + \lVert \beta \rVert_2^2 \right),$
where X denotes the set of training samples, $H(X) \in \mathbb{R}^{S \times L}$ is the hidden layer output matrix, L is the number of nodes in the hidden layer, β is the output weight vector of length L, and Z is the vector of class labels of length S.
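Since Equation (3) is a regularized least-squares problem, the output weights β admit a closed-form solution. The following is a minimal ELM sketch under that formulation; the hidden-layer size, the tanh activation and the regularization weight lam are illustrative assumptions and are not taken from [13].

import numpy as np

def elm_train(X, Z, n_hidden=200, lam=1e-3, seed=0):
    """Extreme learning machine: random hidden layer, closed-form output weights.
    X: (S, d) training samples; Z: (S, c) one-hot class labels."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))   # fixed random input weights
    b = rng.standard_normal(n_hidden)                 # fixed random biases
    H = np.tanh(X @ W + b)                            # hidden layer output matrix H(X)
    # beta = argmin ||H beta - Z||^2 + lam * ||beta||^2  (ridge regression)
    beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ Z)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)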

2.2. Signal and Behavior Analysis

Machine learning techniques allow physiological and behavioral responses to be processed faster and more efficiently in order to find patterns in emotional stimuli records, such as the voice [16,17,18], heart rate [7], body temperature [19] and brain signals [20]. Each technique has certain advantages; for example, voice analysis is low-cost and non-invasive but requires an overt physiological response, whereas much more sophisticated techniques, such as positron emission tomography (PET), which do not require the subject to produce any overt response, are expensive and invasive. Each method can be chosen according to the experimental conditions. We implement an EEG analysis, which, despite being considered noisy, is a low-cost, non-invasive technique that does not require significant technical capacities for its implementation.

EEG Analysis

Our brain interprets emotional stimuli through a series of organismic responses of the central nervous system [9]. Several studies use ML to analyze the behavior of emotional situations by recognizing patterns in the signals collected from the brain cortex, either from specific regions or by considering all available information [20,21]. However, one of the main challenges is that these patterns are sought in large time windows (the time the evoked potential lasts, which can range from a few seconds to minutes), which implies that many other events can affect the experimental process, such as eye movements, facial muscle activity or cognitive states unrelated to the experiment. ML-based analysis has nonetheless been proven to successfully diagnose medical and cognitive conditions [5,22,23,24,25,26,27].
Performing an adequate analysis of the EEG signals is one of the most critical stages for improving the recognition rate. The signal analysis stage is based on the considerations published in [21], which can be observed in Figure 3 and are described as follows:
  • Filtering. A bandpass filter with a cutoff of 0.2 to 47 Hz was implemented to exclude frequencies outside the brain rhythms, and the retained information was subdivided into the rhythm ranges: delta (0.2 to 3.5 Hz), theta (3.5 to 7.5 Hz), alpha (7.5 to 13 Hz), beta (13 to 28 Hz) and gamma (>28 Hz). A sketch of this preprocessing chain is given after this list.
  • Electrode selection by Brodmann bounded regions. The electrode selection methodology proposed in [21] was implemented, which links the Brodmann areas related to the regions of the cortex specialized in visual and auditory processing and to the hypothalamus. In this way, the analysis region is delimited to only those electrodes placed over the parietal, temporal, frontal and occipital lobes of the cortex, since these could provide more information related to the emotional process. We therefore only consider the electrodes F7, FC5, T7, CP5, P7, P3, O1, Pz, P4, P8, O2, CP6, T8, FC6 and F8. This reduction to 15 electrodes of a differentiated 10/20 scheme considerably reduces the amount of data to be processed, which helps to improve the processing time.
  • Blind source separation (BSS). BSS is used to remove the information coming from sources outside our boundary region, according to Equations (4) and (5):
    $P_Y(y(n)) = \prod_{i=1}^{m} P_{y_i}(y_i(n)) \quad \forall n,$
    $S_i = e_i + \sum_{m=1}^{n} r_m.$
    Equation (4) is the joint probability distribution of the estimated sources and Equation (5) is the sum of the overlapped signals.
  • Feature extraction. We use a Daubechies 4 wavelet transform to extract the translation and scale coefficients (Equation (6)), which serve as features in the recognition process. Likewise, the variance between the scales is analyzed and added as an extra feature for the analysis.
    $d_{j,k} = \langle f(t), \Psi_{j,k}(t) \rangle = \frac{1}{\sqrt{2^{j}}} \int f(t)\, \Psi\!\left(2^{-j} t - k\right) dt,$
    where $\Psi_{j,k}(t)$ is the mother wavelet with four vanishing moments, as required by Equation (7):
    $\int t^{k}\, \Psi(t)\, dt = 0 \quad \text{for } 0 \le k < N.$
    The Daubechies wavelets are orthogonal (and hence biorthogonal) but not symmetric. Their compact support, that is, the range over which they are non-zero, is $[0, 2N-1]$, and these waveforms can be implemented as db2, db4, db8 and db16; however, there is no rule for selecting the number of vanishing moments, and the choice of four was made through experimentation.
    The literature reports that most experimental proposals are based on complex preprocessing techniques, such as wavelet or matching pursuit techniques, to process the signals [4,28], while some other proposals analyze the signals without a pre-processing stage by using exhaustive methods instead of traditional digital signal processing [29,30].
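The sketch referenced in the filtering bullet above condenses this preprocessing chain (electrode reduction, rhythm-band filtering and db4 feature extraction). It assumes a 128 Hz sampling rate, SciPy/PyWavelets as the signal processing libraries, an upper gamma cutoff of 47 Hz matching the bandpass filter, and illustrative function names; it is not the exact implementation used in the experiments.

import numpy as np
import pywt
from scipy.signal import butter, filtfilt

FS = 128                                  # sampling rate of the preprocessed data (assumption)
BANDS = {"delta": (0.2, 3.5), "theta": (3.5, 7.5), "alpha": (7.5, 13.0),
         "beta": (13.0, 28.0), "gamma": (28.0, 47.0)}
KEEP = ["F7", "FC5", "T7", "CP5", "P7", "P3", "O1", "Pz",
        "P4", "P8", "O2", "CP6", "T8", "FC6", "F8"]     # Brodmann-based reduction

def bandpass(x, lo, hi, fs=FS, order=4):
    """Zero-phase Butterworth bandpass filter applied along the last axis."""
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x, axis=-1)

def eeg_features(segment, channel_names):
    """Electrode reduction, rhythm-band filtering and db4 wavelet features."""
    idx = [channel_names.index(ch) for ch in KEEP if ch in channel_names]
    seg = np.asarray(segment)[idx]                     # keep only the 15 selected electrodes
    features = []
    for lo, hi in BANDS.values():
        band = bandpass(seg, lo, hi)
        for ch in band:                                # per-channel db4 decomposition
            coeffs = pywt.wavedec(ch, "db4", level=4)
            features.extend(float(c.var()) for c in coeffs)   # variance per scale as feature
    return np.asarray(features)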

2.3. EEG Neural Network Architecture

A recurrent neural network structure with four hidden layers was used, with the architecture shown in Figure 4; this configuration was derived from an experimental process that can be consulted in [21], which determined that this kind of low-complexity neural network shows excellent performance without increasing the computational burden. The network uses a one-dimensional convolutional structure to perform a depthwise convolution that acts separately on the channels, followed by a pointwise convolution that mixes the channels. The three emotional states considered define the three classes of the model.
The experimental configuration for the analysis of the EEG signals in this work is:
  • Convolutional 1D neural network.
  • Ten-fold cross-validation.
  • Four hidden layers.
  • Alpha value of 0.0002.
  • Maximum limit of iterations = 200.
  • ReLU activation function for each hidden layer.
  • Softmax output layer.
  • 60% training set, 20% validation, and 20% testing (10 repetitions per trial).
Each data vector is composed of 8064 kernel coefficients; a total of 270 stimuli were used, 90 for each emotional state.
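For reference, the following is a minimal Keras sketch in the spirit of this configuration (separable 1D convolutions, four hidden layers, ReLU activations, softmax output, learning rate 0.0002, up to 200 epochs). The layer widths, kernel sizes and pooling factors are assumptions for illustration and are not the exact architecture of Figure 4.

import tensorflow as tf
from tensorflow.keras import layers

N_FEATURES = 8064   # kernel coefficients per data vector
N_CLASSES = 3       # emotional states used in the recognition stage

def build_eeg_model():
    """Small separable 1D-CNN; widths, kernel sizes and pooling are illustrative."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(N_FEATURES, 1)),
        layers.SeparableConv1D(16, 8, activation="relu"),   # depthwise + pointwise convolution
        layers.MaxPooling1D(4),
        layers.SeparableConv1D(32, 8, activation="relu"),
        layers.MaxPooling1D(4),
        layers.SeparableConv1D(64, 8, activation="relu"),
        layers.GlobalAveragePooling1D(),
        layers.Dense(64, activation="relu"),
        layers.Dense(N_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=2e-4),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

# model = build_eeg_model()
# model.fit(X_train[..., None], y_train, epochs=200,
#           validation_data=(X_val[..., None], y_val))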

2.4. Data Source

The DEAP database was used in this work [31]; this dataset includes video recordings of the experimental process, its metadata are detailed, and its methodology is thoroughly described. This database therefore allows us to analyze the video images and detect facial expressions, which we can link with the recordings provided by the EEG signals.
The dataset contains EEG, EMG and EOG signals, as well as galvanic skin response, temperature and respiration rate records. A total of 32 participants were analyzed; frontal video recordings of the faces of 22 of them are available. The video was recorded at 50 fps using the h264 codec. The dataset contains 32 .bdf files (BioSemi's data format, generated by the ActiView recording software), each with 48 recorded channels at 512 Hz (32 EEG channels, 12 peripheral channels, three unused channels and one status channel).
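In addition to the raw .bdf recordings, DEAP distributes a preprocessed Python version that is commonly used, since it already provides downsampled, artifact-reduced trials. A minimal loading sketch for that version follows; the pickled-dict layout ('data' of shape 40 × 40 × 8064 and 'labels' of shape 40 × 4 per subject) is taken from the DEAP documentation but should be treated as an assumption here, as should the file path.

import pickle
import numpy as np

def load_deap_subject(path):
    """Load one subject file from DEAP's preprocessed Python distribution."""
    with open(path, "rb") as f:
        subject = pickle.load(f, encoding="latin1")   # files were pickled under Python 2
    data = np.asarray(subject["data"])       # trials x channels x samples (downsampled to 128 Hz)
    labels = np.asarray(subject["labels"])   # valence, arousal, dominance, liking per trial
    return data, labels

# data, labels = load_deap_subject("data_preprocessed_python/s01.dat")  # hypothetical path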

3. Results

This work presents an average recognition rate of 89.6% ± 0.109, which is superior to the 82.9% average performance calculated for 2016 to 2020 in [3]. Although some works present much higher recognition rates, this is due to the kind of analysis presented in them, that is, they present mono- or bi-modal methods, whereas multimodal works, such as the one presented here, obtain considerably more conservative results, as shown in Table 1.
The confusion matrix presented in Figure 5 shows that HALV (anger) is almost fully recognized; hence, the false-positive rate considerably affects the performance of the other classes, as shown in Figure 6, which indicates that 10% of each of the other two classes is recognized as HALV. However, as can be seen in Figure 7, the predictions are very stable and the overall recognition does not over-fit. Table 1 and Table 2 show that the recognition rates obtained in our implementation are quite competitive with the averages reported in recent years. It is essential to note, however, that our work focuses on increasing confidence in the emotional analysis process, although isolating the signals could also increase the recognition rate in multi-class scenarios.

4. Discussion

The premise of this work is that most studies related to the analysis of emotions through EEG are performed by analyzing the complete signals of the experimental process and that, despite the recognition rates, many other phenomena can be involved in the recognition task, for example, facial muscle activity or cognitive load. It must also be considered that brain activity is chaotic and that, during the experimental process, there may be various cognitive functions that we cannot control, for example, if the person has an intrusive thought or is thinking about something else. Therefore, it is essential to add an extra level of verification, and thus we propose the delimitation methodology presented in this work.
Although many works already implement filters to eliminate some of the phenomena produced by the body's natural dynamics, it is not easy to characterize every one of them, since we would have to map each of the functions that the brain performs, and this would also change from person to person. Even if we could overcome this limitation, there is another one that is perhaps more challenging to control: the emotional and mental state of the person, since, as mentioned, we would have to know whether the person is thinking about something else while undergoing the experiment, which is entirely unworkable. So this proposal, although it barely scratches the surface of the problem, helps to identify the exact moments when a physiological response related to an emotion occurred and to analyze the moment before it.
This proposal arises after publishing various works focused on EEG analysis in which, although high recognition rates were achieved in experimental processes, the results were complicated to replicate in real settings since, in addition to effects such as neuroplasticity, we also noticed that people were distracted, remembered, rambled and imagined situations during the experimental processes. This could generate conditions entirely distinct from those of our study, so although this is not a definitive proposal, we believe that it helps us get a little closer to a real-life implementation.

5. Conclusions

This work presents a competitive recognition rate for emotional states through EEG signals, despite not focusing on increasing the recognition rate but rather on proposing a methodology that reduces the effects produced by non-related phenomena that may occur during an experimental process and that establishes a direct correlation between the recognition and the experimental procedure.
One aspect to highlight is the need to expand the scope of this work to a broader population sector; however, the lack of databases such as DEAP is a critical issue to overcome in future work, motivating us to create an original database that helps to corroborate our research. Another important aspect of this work is that the recognition stage is sped up, since the NNs are trained with shorter fragments of the signal instead of large amounts of traits that may or may not be related to the physiological response to an emotional stimulus.
An aspect not fully addressed in the document is the combination of the two machine learning techniques used to generate a more robust output. Nevertheless, the results obtained in this research demonstrate the advantages of using ML techniques to link two phenomena that are, a priori, seen as belonging to two different research areas. Both the analysis of facial expressions and EEG analysis have their own fields of study; however, when combined, they generate a more robust and reliable analysis methodology and open the possibility of using many other physiological responses of emotions to consolidate the results.
Finally, we understand that the process presented in this work depends on the recognition rate of the FER stage and, despite the optimization techniques that have been implemented, this stage should still be improved to achieve more precise references of when emotions occur; this can be expected to improve as analysis and ML methodologies advance.

Author Contributions

Conceptualization, A.R.A.; methodology, A.R.A.; software, A.R.A.; validation, A.R.A., D.E.H., A.Q. and A.C.T.; investigation, A.R.A. and D.E.H.; resources, A.Q. and A.C.T.; data curation, A.R.A. and D.E.H.; writing—original draft preparation, A.R.A.; writing—review and editing, D.E.H.; visualization, A.Q. and A.C.T.; supervision, A.R.A., D.E.H., A.Q. and A.C.T.; project administration, A.R.A.; funding acquisition, A.Q. and A.C.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received funding from the Tecnológico Nacional de México through the projects "Apply machine learning techniques to correlate EEG and facial expressions produced by emotional states" and "Feature engineering and Machine Learning for emotion recognition in EEG signals".

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The DEAP dataset can be downloaded from: http://www.eecs.qmul.ac.uk/mmv/datasets/deap/.

Acknowledgments

Thanks to the Tecnológico Nacional de México for making this work possible.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Torres, E.P.; Torres, E.A.; Hernández-Álvarez, M.; Yoo, S.G. Emotion Recognition Related to Stock Trading Using Machine Learning Algorithms With Feature Selection. IEEE Access 2020, 8, 199719–199732. [Google Scholar] [CrossRef]
  2. Li, Y.; Ma, R.; Zhao, H.; Qiu, S.; Hu, Z. Predicting the Change on Stock Market Index Using Emotions of Market Participants with Regularization Methods. In Proceedings of the 2017 13th International Conference on Computational Intelligence and Security (CIS), Hong Kong, China, 15–18 December 2017; pp. 607–610. [Google Scholar] [CrossRef]
  3. Suhaimi, N.S.; Mountstephens, J.; Teo, J. EEG-Based Emotion Recognition: A State-of-the-Art Review of Current Trends and Opportunities. Comput. Intell. Neurosci. 2020, 19, 171–179. [Google Scholar] [CrossRef] [PubMed]
  4. Murugappan, M. Human emotion classification using wavelet transform and KNN. In Proceedings of the 2011 International Conference on Pattern Analysis and Intelligent Robotics (ICPAIR), Kuala Lumpur, Malaysia, 28–29 June 2011; Volume 1, pp. 148–153. [Google Scholar] [CrossRef]
  5. Moschona, D.S. An Affective Service based on Multi-Modal Emotion Recognition, using EEG enabled Emotion Tracking and Speech Emotion Recognition. In Proceedings of the 2020 IEEE International Conference on Consumer Electronics—Asia (ICCE-Asia), Seoul, Korea, 1–3 November 2020; pp. 1–3. [Google Scholar] [CrossRef]
  6. Jones, D.R. Short Paper: Psychosocial Aspects of New Technology Implementation BT-HCI in Business, Government, and Organizations; Springer International Publishing: Cham, Switzerland, 2018; pp. 606–610. [Google Scholar]
  7. Guo, H.W.; Huang, Y.S.; Chien, J.C.; Shieh, J.S. Short-term analysis of heart rate variability for emotion recognition via a wearable ECG device. In Proceedings of the 2015 International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS), Okinawa, Japan, 28–30 November 2015; pp. 262–265. [Google Scholar] [CrossRef]
  8. Kwon, S. Att-Net: Enhanced emotion recognition system using lightweight self-attention module. Appl. Soft Comput. 2021, 102, 107101. [Google Scholar] [CrossRef]
  9. Scherer, K.R. What are emotions? And how can they be measured? Soc. Sci. Inf. 2005, 44, 695–729. [Google Scholar] [CrossRef]
  10. Ekman, P.; Friesen, W.V.; O’Sullivan, M.; Chan, A.; Diacoyanni-Tarlatzis, I.; Heider, K.; Krause, R.; LeCompte, W.A.; Pitcairn, T.; Ricci-Bitti, P.E.; et al. Universals and cultural differences in the judgments of facial expressions of emotion. J. Personal. Soc. Psychol. 1987, 4, 712–717. [Google Scholar] [CrossRef]
  11. Russell, J.A. A circumplex model of affect. J. Personal. Soc. Psychol. 1980, 39, 1161–1178. [Google Scholar] [CrossRef]
  12. McDuff, D.; Mahmoud, A.; Mavadati, M.; Amr, M.; Turcot, J.; el Kaliouby, R. AFFDEX SDK: A Cross-Platform Real-Time Multi-Face Expression Recognition Toolkit. In Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems, CHI EA ’16, San Jose, CA, USA, 7–12 May 2016; ACM: New York, NY, USA, 2016; pp. 3723–3726. [Google Scholar] [CrossRef]
  13. Shojaeilangari, S.; Yau, W.; Nandakumar, K.; Li, J.; Teoh, E.K. Robust Representation and Recognition of Facial Emotions Using Extreme Sparse Learning. IEEE Trans. Image Process. 2015, 24, 2140–2152. [Google Scholar] [CrossRef] [PubMed]
  14. Candra Kirana, K.; Wibawanto, S.; Wahyu Herwanto, H. Facial Emotion Recognition Based on Viola-Jones Algorithm in the Learning Environment. In Proceedings of the 2018 International Seminar on Application for Technology of Information and Communication, Aizu-Wakamatsu, Japan, 1–3 November 2018; pp. 406–410. [Google Scholar] [CrossRef]
  15. Rodriguez Aguinaga, A.; Realyvásquez-Vargas, A.; López, R.M.Á.; Quezada, A. Cognitive Ergonomics Evaluation Assisted by an Intelligent Emotion Recognition Technique. Appl. Sci. 2020, 10, 1736. [Google Scholar] [CrossRef] [Green Version]
  16. Longobardi, T.; Sperandeo, R.; Albano, F.; Tedesco, Y.; Moretto, E.; Di Sarno, A.D.; Dell’Orco, S.; Maldonato, N.M. Co-regulation of the voice between patient and therapist in psychotherapy: Machine learning for enhancing the synchronization of the experience of anger emotion: An experimental study proposal. In Proceedings of the 2018 9th IEEE International Conference on Cognitive Infocommunications (CogInfoCom), Budapest, Hungary, 22–24 August 2018; pp. 113–116. [Google Scholar] [CrossRef]
  17. Umamaheswari, J.; Akila, A. An Enhanced Human Speech Emotion Recognition Using Hybrid of PRNN and KNN. In Proceedings of the 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), Faridabad, India, 14–16 February 2019; pp. 177–183. [Google Scholar] [CrossRef]
  18. Mustaqeem; Kwon, S. 1D-CNN: Speech Emotion Recognition System Using a Stacked Network with Dilated CNN Features. Comput. Mater. Contin. 2021, 67, 4039–4059. [Google Scholar] [CrossRef]
  19. Hayano, J.; Tanabiki, T.; Iwata, S.; Abe, K.; Yuda, E. Estimation of Emotions by Wearable Biometric Sensors Under Daily Activities. In Proceedings of the 2018 IEEE 7th Global Conference on Consumer Electronics (GCCE), Nara, Japan, 9–12 October 2018; pp. 240–241. [Google Scholar] [CrossRef]
  20. Adrian, R.A.; Miguel Angel, L.R.; del Rosario, B.F. Classification model of arousal and valence mental states by EEG signals analysis and Brodmann correlations. Int. J. Adv. Comput. Sci. Appl. 2015, 6, 230–238. [Google Scholar] [CrossRef] [Green Version]
  21. Aguiñaga, A.R.; Ramirez, M.A.L. Emotional states recognition, implementing a low computational complexity strategy. Health Inform. J. 2016, 24, 146–170. [Google Scholar] [CrossRef]
  22. Gemein, L.A.; Schirrmeister, R.T.; Chrabąszcz, P.; Wilson, D.; Boedecker, J.; Schulze-Bonhage, A.; Hutter, F.; Ball, T. Machine-learning-based diagnostics of EEG pathology. NeuroImage 2020, 220, 117021. [Google Scholar] [CrossRef] [PubMed]
  23. Llorente, D.; Ballesteros, M.; Salgado Ramos, I.D.J.; Oria, J.I.C. Deep learning adapted to differential neural networks used as pattern classification of electrophysiological signals. IEEE Trans. Pattern Anal. Mach. Intell. 2021. [Google Scholar] [CrossRef] [PubMed]
  24. Tziridis, K.; Kalampokas, T.; Papakostas, G.A. EEG Signal Analysis for Seizure Detection Using Recurrence Plots and Tchebichef Moments. In Proceedings of the 2021 IEEE 11th Annual Computing and Communication Workshop and Conference (CCWC), Online. 27–30 January 2021; pp. 184–190. [Google Scholar] [CrossRef]
  25. Vahid, A.; Mückschel, M.; Stober, S.; Stock, A.K.; Beste, C. Applying deep learning to single-trial EEG data provides evidence for complementary theories on action control. Commun. Biol. 2020, 3, 112. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  26. Chen, Y.; Chang, R.; Guo, J. Effects of Data Augmentation Method Borderline-SMOTE on Emotion Recognition of EEG signals Based on Convolutional Neural NetWork. IEEE Access 2021. [Google Scholar] [CrossRef]
  27. Holzinger, A. Explainable AI and Multi-Modal Causability in Medicine. i-com 2021, 19, 171–179. [Google Scholar] [CrossRef]
  28. Shahnaz, C.; Hasan, S.S. Emotion recognition based on wavelet analysis of Empirical Mode Decomposed EEG signals responsive to music videos. In Proceedings of the 2016 IEEE Region 10 Conference (TENCON), Singapore, 22–25 November 2016; pp. 424–427. [Google Scholar] [CrossRef]
  29. Park, M.; Oh, H.; Jeong, H.; Sohn, J. Eeg-based emotion recogntion during emotionally evocative films. In Proceedings of the 2013 International Winter Workshop on Brain-Computer Interface (BCI), Gangwo, Korea, 18–20 February 2013; pp. 56–57. [Google Scholar] [CrossRef]
  30. Daşdemir, Y.; Yıldırım, S.; Yıldırım, E. Classification of emotion primitives from EEG signals using visual and audio stimuli. In Proceedings of the 2015 23nd Signal Processing and Communications Applications Conference (SIU), Malatya, Turkey, 16–19 May 2015; pp. 2250–2253. [Google Scholar] [CrossRef]
  31. Koelstra, S.; Muhl, C.; Soleymani, M.; Lee, J.S.; Yazdani, A.; Ebrahimi, T.; Pun, T.; Nijholt, A.; Patras, I. DEAP: A Database for Emotion Analysis using Physiological Signals. IEEE Trans. Affect. Comput. Spec. Issue Nat. Affect. Resour. Syst. Build. Eval. 2012, 3, 18–31. [Google Scholar] [CrossRef] [Green Version]
  32. Dzedzickis, A.; Kaklauskas, A.; Bucinskas, V. Human Emotion Recognition: Review of Sensors and Methods. Sensors 2020, 20, 592. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Figure 1. Facial emotion recognition process to create time set points.
Figure 2. FER neural network configuration implementation (Python TensorFlow configuration).
Figure 3. EEG analysis, treatment methodology, processing and feature extraction.
Figure 4. EEG neural network architecture configuration (Python TensorFlow configuration).
Figure 5. Confusion matrix of the HAHV (happiness), HALV (anger) and LALV (sadness) recognition rates.
Figure 6. Recognition correspondence for each class and false positives; coefficient parameters: AUC: 0.924, CA: 0.874, F1: 0.871, precision: 0.896.
Figure 7. Lift analysis for the training and test comparison. Performance of the classification model at all classification thresholds.
Table 1. State-of-the-art recognition rates obtained by analyzing electrocardiogram (ECG), skin temperature measurements (SKT), galvanic skin response (GSR), heart rate variability (HRV), respiration rate analysis (RR), electromyogram (EMG) and electrooculography (EOG), as presented in [32].

Measurement Method    Average Accuracy (1 and 2 Categories)    Average Accuracy (3 or More Categories)
ECG                   79%                                      78%
SKT                   78%                                      77%
GSR                   77%                                      80%
HRV                   90%                                      73%
EOG                   79%                                      85%
RR                    79%                                      81%
EMG                   -                                        81%
Table 2. State-of-the-art performance comparisons between analysis techniques reported in [3]. The average of each compiled technique is shown; the overall average performance for the 2016–2020 period is calculated as 82.9%.

Classifier                      Average Performance
Neural network                  85.80%
Support vector machine          77.80%
Random forest                   98.20%
K-nearest neighbor              88.94%
Multilayer perceptron           78.16%
Bayes                           69.62%
Fisherface                      91%
Extreme learning machine        87.06%
K-means                         78.06%
Linear discriminant analysis    71.30%
Gaussian process                71.30%