#### *3.6. Classifiers*

Figure 10 shows that most classifiers were linear (48%) or neural networks (41%); a few papers used nearest-neighbor (7%) and ensemble methods (5%). It is also worth mentioning that the following algorithms have become increasingly popular in EEG-based emotion recognition applications:


Within the period considered, this review did not find studies that applied non-linear Bayesian classifiers such as hidden Markov models (HMMs).

#### *3.7. Performance vs. the Number of Classes-Emotions*

The performance of almost all systems was evaluated using accuracy, except for two: one reported the area under the curve (AUC), and the other an F1 measure. Unfortunately, EEG datasets are usually unbalanced, with one or two labeled emotions far more numerous than the others, which can lead to biased classifiers and makes plain accuracy misleading; performance measures should therefore be reported with enough context to interpret their outcomes. In our view, this is why such results are not entirely comparable among different studies.
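To illustrate why accuracy alone can mislead on unbalanced data, here is a minimal sketch; the labels are hypothetical, and neither the data nor these metric implementations come from the surveyed papers:

```python
# Hypothetical illustration: on an unbalanced emotion dataset, a classifier
# that ignores the minority class scores high accuracy but zero F1.

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1(y_true, y_pred, positive):
    tp = sum(t == p == positive for t, p in zip(y_true, y_pred))
    fp = sum(p == positive and t != positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# 90 "neutral" trials vs. 10 "fear" trials -- a typical imbalance.
y_true = ["neutral"] * 90 + ["fear"] * 10
y_pred = ["neutral"] * 100          # trivial majority-class predictor

print(accuracy(y_true, y_pred))     # 0.9 -- looks strong
print(f1(y_true, y_pred, "fear"))   # 0.0 -- reveals the failure
```

Reporting a class-sensitive measure such as F1 alongside accuracy is one way to give the outcome the context argued for above.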

In Figure 11, we present the relationship between systems and the number of classified emotions. Most systems use the VA or VAD spaces and classify each dimension as a bi-class (for instance, valence positive and negative; arousal high-value and low value) or tri-class problem (for example, valence positive, neutral, and negative; arousal and dominance high-value and low-value).

**Figure 11.** Percentage of systems with different numbers of classified emotions.

Arousal and valence have the highest usage percentages (25.8%). On the other hand, 16.1% categorized valence with three classes: positive, neutral, and negative. A further 9.7% classified three discrete emotions such as sadness, love, and anger. Lastly, 6.5% ranked valence as two classes (positive and negative), four discrete emotions (happy, sad, fear, and relaxed), one discrete emotion (disgust), or emotions located in one of the four quadrants of the VA space (high valence-high arousal, high valence-low arousal, low valence-high arousal, and low valence-low arousal).
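The quadrant scheme above can be sketched as a simple mapping from ratings to labels; the 1-9 SAM-style rating scale and the 5.0 midpoint threshold are illustrative assumptions, not values taken from the surveyed systems:

```python
# Hypothetical sketch: map self-assessment ratings (assumed 1-9 scale)
# to the four valence-arousal (VA) quadrants. The midpoint threshold is
# an assumption for illustration.

def va_quadrant(valence, arousal, midpoint=5.0):
    v = "high valence" if valence > midpoint else "low valence"
    a = "high arousal" if arousal > midpoint else "low arousal"
    return f"{v}-{a}"

print(va_quadrant(7.2, 8.1))  # high valence-high arousal (e.g., excitement)
print(va_quadrant(2.3, 7.5))  # low valence-high arousal (e.g., fear)
```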

Classifier performance should be evaluated bearing in mind that chance-level accuracy is inversely proportional to the number of detected emotions. In other words, classification accuracy is only informative if it exceeds that of a random classification process (equal chance for each class). As the number of classes increases, a random classification process yields lower accuracy: a two-class random process would be 50% accurate, a three-class process about 33%, and so on. Such chance levels should provide the classification performance benchmark for our evaluations.
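A minimal sketch of these chance baselines, plus a kappa-style correction that puts accuracies with different class counts on one scale; the correction is our illustration (assuming balanced classes), not a metric used in the surveyed papers:

```python
def chance_accuracy(n_classes):
    """Accuracy of a uniform random guess over n balanced classes."""
    return 1.0 / n_classes

def chance_corrected(acc, n_classes):
    """Kappa-style rescaling: 0 = chance level, 1 = perfect.
    Assumes balanced classes; illustrative, not from the surveyed papers."""
    chance = chance_accuracy(n_classes)
    return (acc - chance) / (1.0 - chance)

for k in (2, 3, 4):
    print(f"{k} classes: chance = {chance_accuracy(k):.1%}")

# 90% on 2 classes and 60% on 3 classes are not directly comparable;
# the chance correction makes that explicit.
print(round(chance_corrected(0.90, 2), 2))  # 0.8
print(round(chance_corrected(0.60, 3), 2))  # 0.4
```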

Although system performance depends on many factors, it is possible to find some relationship between the number of classes, the type of emotions classified, and the accuracy obtained (Figure 12). The best results are obtained with two classes, either as discrete emotions or as positive and negative values in a dimensional space. The second-best value is found for the recognition of one negative discrete emotion, such as dislike or disgust. That classifying a single emotion does not yield the best performance could be explained by our observation, noted throughout this review, that negative emotions are more challenging to classify and tend to yield lower performance values.

Comparing approaches and results obtained through different BCI-based systems is complex because each system uses diverse experimental methods for emotion elicitation, protocols to record EEG signals, datasets, feature extraction and selection methods, and classification algorithms; generally speaking, each implementation has different settings. Ideally, systems should be tested under similar conditions, but that scenario is not yet available. However, we can perform a comparative analysis to extract trends, bearing such limitations in mind.

**Figure 12.** Accuracy vs. types and number of classified emotions.

#### **4. Future Work**

Datasets developed for specific applications use passive methods to elicit emotions, such as IAPS, IADS, music videos, and film clips. Public databases such as DEAP and SEED elicit emotions through music videos and film clips, respectively. Few studies implement active elicitation methods, such as video games and flight simulators.

Going forward, we expect the generation of datasets that use active elicitation methods, because these techniques better simulate "real-life" events and are more efficient at emotion induction. However, implementing such studies requires a significantly more complex experimental setup.

Furthermore, the study of individual emotions has recently been trending. Some works address fear detection, which has applications in the investigation of phobias and other psychiatric disorders. It is worth mentioning that our survey found that negative emotions are more challenging to detect than positive ones.

We did not find in the literature any EEG-based recognition of mixed feelings that combine positive and negative affects sensed at the same moment, for instance, bittersweet feelings. These mixed emotions are interesting because they are related to the study of higher creative performance [141].

Feature extraction and selection are EEG-based BCI system components that are continuously evolving. They should be designed based on a profound understanding of the brain's biology and physiology. The development of novel features is a topic that can contribute significantly to improving the results of emotion recognition systems. For instance, time-domain features are combined with frequency and time-frequency characteristics, channel location, and connectivity criteria. The development of novel feature extraction methods includes asymmetry discoveries in different functioning brain segments, new electrode locations that provide more information, connectivity models (between channels), and correlations needed for understanding functionality.
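As an illustration of the kinds of features discussed, here is a hedged sketch of spectral band power and a frontal alpha-asymmetry feature; the sampling rate, band edges, channel names, and synthetic signals are all assumptions for demonstration, not a method from any surveyed system:

```python
import numpy as np

FS = 128                      # sampling rate in Hz (assumed)
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

def band_power(signal, fs, band):
    """Mean FFT power of `signal` within the frequency `band` (Hz)."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    mask = (freqs >= band[0]) & (freqs < band[1])
    return psd[mask].mean()

def alpha_asymmetry(left, right, fs=FS):
    """log(right) - log(left) alpha power: an asymmetry-style feature."""
    return (np.log(band_power(right, fs, BANDS["alpha"]))
            - np.log(band_power(left, fs, BANDS["alpha"])))

# Synthetic 2-second epochs standing in for left/right frontal channels
# (e.g., F3/F4): a 10 Hz rhythm, stronger on the right, plus noise.
t = np.arange(2 * FS) / FS
rng = np.random.default_rng(0)
f3 = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)
f4 = 2.0 * np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)

print(alpha_asymmetry(f3, f4))  # positive: stronger alpha on the right
```

Real pipelines would of course window and average spectra (e.g., Welch's method) and use recorded rather than synthetic signals; the sketch only shows the shape of such hand-crafted features.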

This evolution reflects the view that EEG signals and their frequency bands are related to multiple functional and connectivity considerations. The study of the relationship between EEG and biological or psycho-emotional elements should improve going forward. Improved features could better capture individual emotion dynamics and also correlate characteristics across individuals and sessions.

A particularly interesting trend in feature extraction is the use of deep neural networks. These systems receive raw data to avoid loss of information and exploit the networks' layered processing to obtain relevant features automatically.
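A minimal illustration of the idea (not a full deep model): a single one-dimensional convolution with a ReLU over a raw signal, the kind of operation such networks' first layers apply to derive features automatically; here the kernel is fixed by hand rather than learned:

```python
import numpy as np

# Illustrative only: one conv + ReLU stage, as in a deep network's first
# layer over raw EEG. In a trained model the kernel weights are learned.

def conv1d(signal, kernel, stride=1):
    """Valid-mode 1-D convolution (cross-correlation) with a stride."""
    n = (len(signal) - len(kernel)) // stride + 1
    return np.array([signal[i * stride:i * stride + len(kernel)] @ kernel
                     for i in range(n)])

def relu(x):
    return np.maximum(x, 0.0)

raw = np.sin(np.linspace(0, 8 * np.pi, 256))   # stand-in for a raw EEG epoch
kernel = np.array([1.0, 0.0, -1.0])            # edge-like filter (assumed)

features = relu(conv1d(raw, kernel, stride=2))
print(features.shape)  # (127,)
```

Stacking many such learned filters, nonlinearities, and pooling stages is what lets deep models replace the hand-crafted features described earlier.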

The overall reported system accuracy ranges from 53% to 90% for the classification of one or more emotions. However, a gap likely remains between experiments conducted in a laboratory and real-world applications performed in real time, which present enormous challenges. Some authors suggest that training datasets should be generated on a larger scale to overcome those challenges. Indeed, we believe it is reasonable that larger datasets could catalyze research in this field. It is worth mentioning that a similar dynamic played out in image recognition, which expanded rapidly thanks to the generation of massive databases. Nevertheless, such an effort for EEG datasets would likely require collaboration between various research groups, particularly to obtain emotions triggered by active elicitation methods.

Overall, we believe systems should be trained with larger sample sizes (and more samples per subject), plus the use of real-time data. With such improved datasets, unsupervised techniques could be implemented to obtain comprehensive models. Moreover, these more robust systems might allow for transfer learning, i.e., general models that can be applied successfully to particular individuals.
