1. Introduction
As an advanced function of the human brain, emotion plays an important role in daily life. Emotion recognition has high application value in the fields of commerce, medicine, education, and human-computer interaction, and has become a research area of great interest [
1]. In recent years, researchers have typically used emotional materials, such as pictures, sounds, and videos, to induce subjects’ emotions, and have analyzed the resulting physiological signals to identify regularities in emotional changes [
1]. In the study of emotion recognition, two emotional dimensions of Russell’s Valence-Arousal emotion model are usually used for emotion evaluation [
2]. Physiological signals, such as electroencephalography (EEG), electrocardiography (ECG), electromyography (EMG), galvanic skin response (GSR), and respiration rate (RR), are often used to reflect emotional states. The most commonly used is EEG, due to its good temporal resolution as well as acceptable spatial resolution. Furthermore, EEG has been widely used in brain-computer interfaces (BCIs), and the study of EEG-based emotion recognition may provide great value for improving user experience and performance of BCI applications.
At present, there are two major issues for EEG-based emotion recognition that require further investigation. One issue is that the accuracy of the emotion classification is generally low. A critical step in the EEG-based emotion recognition task is to extract features. Several EEG-based features, including Hjorth features [
3], fractal dimension features [
4], higher order spectra features [
5], power spectral density (PSD) features [
6], differential entropy (DE) features [
7], rational asymmetry (RASM) features [
8], and wavelet features [
9] have been successfully extracted and applied in emotion recognition. Some researchers have tried to concatenate the above features to extract more information and improve the performance of emotion recognition. However, Samara et al. [
10] studied the effect of different feature vectors on the classification accuracy of affective states, and found that combining extracted features to form a feature vector does not necessarily improve the classification accuracy. Another study [
11] showed that different subjects may have different sensitivities to different features. These findings are related to what is called the high-dimensionality issue in EEG, because not all of these features carry significant information about emotions. Irrelevant and redundant features enlarge the feature space, make pattern detection more difficult, and increase the risk of overfitting. It is, therefore, important to identify subject-dependent features that have a significant impact on the performance of individual emotion recognition.
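As an illustration of two of the features listed above, the following minimal sketch computes the Hjorth parameters and the differential entropy of a single-channel segment. It assumes unit sample spacing (derivatives are approximated by first differences) and the commonly used Gaussian form of DE, 0.5 ln(2πeσ²); the function names and the synthetic test signal are hypothetical and are not taken from this paper.

```python
import math
from statistics import pvariance


def hjorth_parameters(x):
    """Return (activity, mobility, complexity) of a 1-D signal.

    Derivatives are approximated by first differences, which assumes
    unit sample spacing (an assumption of this sketch).
    """
    dx = [b - a for a, b in zip(x, x[1:])]      # first difference
    ddx = [b - a for a, b in zip(dx, dx[1:])]   # second difference
    var_x, var_dx, var_ddx = pvariance(x), pvariance(dx), pvariance(ddx)
    activity = var_x                                      # signal power
    mobility = math.sqrt(var_dx / var_x)                  # mean-frequency proxy
    complexity = math.sqrt(var_ddx / var_dx) / mobility   # bandwidth proxy
    return activity, mobility, complexity


def differential_entropy(x):
    """DE of a (band-filtered) segment under a Gaussian assumption."""
    return 0.5 * math.log(2 * math.pi * math.e * pvariance(x))


# Synthetic sine wave standing in for one band-filtered EEG channel.
segment = [math.sin(0.1 * n) for n in range(500)]
print(hjorth_parameters(segment))
print(differential_entropy(segment))
```

In practice each feature would be computed per channel and per frequency band, which is how the feature vector quickly becomes high-dimensional.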
There are currently many methods available for feature extraction. For example, Arnau-González et al. [
12] selected features from the spatial and frequency domains by principal component analysis, and found that naive Bayes, support vector machine (SVM) with an RBF kernel, and SVM with a sigmoid kernel can significantly improve the accuracy of classification, with best accuracies of 67.7% for arousal and 69.6% for valence. Zhong et al. [
13] developed a new approach for EEG feature selection, which transferred recursive feature elimination and implemented a linear least-squares SVM classification, reaching an arousal accuracy of 78.7% and a valence accuracy of 78.8%. In addition, particle swarm optimization (PSO), a population-based stochastic optimization method, is becoming increasingly popular in the field of feature extraction due to its simple mathematical operations, small number of control parameters, fast convergence, and easy implementation. Several PSO variants have shown good performance in function optimization and feature selection [
14]. For example, Kumar et al. [
15] proposed a supervised PSO-based rough set for feature selection of the BCI multiclass motor imagery task, and the outcome outperformed other algorithms using the same dataset. As an important parameter in PSO, the inertia weight is often used to balance the global exploration and local exploitation of the search process. PSO with linearly-decreasing inertia weight (LDW-PSO) [
16] was recommended due to its good performance on optimization problems. However, few studies currently use the PSO method for EEG-based emotion recognition.
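The linearly-decreasing inertia weight mentioned above can be sketched as follows. The start and end values 0.9 and 0.4 are commonly used defaults in the PSO literature and are an assumption here, as is the function name; the paper's own parameter settings are not given in this section.

```python
def ldw_inertia(iteration, max_iter, w_start=0.9, w_end=0.4):
    """Inertia weight that decreases linearly from w_start to w_end.

    A large weight early on favors global exploration; a small weight
    late in the run favors local exploitation.
    w_start and w_end are assumed defaults, not values from the paper.
    """
    return w_start - (w_start - w_end) * iteration / max_iter


# Print the weight at a few points of a hypothetical 100-iteration run.
for t in (0, 25, 50, 75, 100):
    print(t, round(ldw_inertia(t, 100), 3))
```

In the velocity update of each particle, this weight multiplies the previous velocity, so the schedule directly controls how quickly the swarm shifts from exploring to refining.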
The other issue is that there are few studies on real-time emotion recognition systems, especially for online BCI systems. Most previous studies have focused on the offline analysis of EEG-based emotion recognition. A few online EEG-based emotion recognition systems have been reported. For example, Sourina et al. [
17] developed an online EEG-based emotion recognition system for music therapy, but they did not present the experimental results and system performances. Pan et al. [
18] proposed an EEG-based brain-computer interface (BCI) system for emotion recognition to detect two basic emotional states (happiness and sadness), and achieved an average online accuracy of 74.17%. Daly et al. [
19] reported an affective BCI system to detect the current affective states of the subjects while listening to music. Using band-power features and an SVM classifier, they achieved a real-time average accuracy of 53.96% for three arousal levels of eight subjects. Nevertheless, online BCI-based emotion recognition is still in its infancy.
In this study, a real-time EEG-based emotion recognition BCI system with high accuracy was developed. In order to improve the discrimination ability of the EEG features, we proposed a modified PSO-based feature selection algorithm for emotion recognition. After signal preprocessing, EEG features extracted from the time, frequency, and time-frequency domains were used to find emotion-relevant features and correlate them with emotional states. We combined features from different scale dimensions and found the optimal feature selection using the modified PSO-based method. In particular, a new strategy called multi-stage linearly-decreasing inertia weight (MLDW) was adopted in the PSO method to make it easier to refine the process of decreasing the inertia weight. Furthermore, we used the SVM classifier to recognize emotions based on the emotion model. The above algorithm was verified using EEG data from the DEAP dataset [
20]. Our results show that the accuracy of EEG-based emotion recognition can be significantly improved by using our modified MLDW-PSO feature selection, and that this modified MLDW-PSO can be used for online emotion recognition. To further demonstrate that our proposed method is valid for emotion recognition, we designed an online experiment in which emotions were evoked by video stimuli. The modified MLDW-PSO algorithm was introduced into the BCI system as a feature selection tool, so that the BCI system can output real-time feedback of online emotion recognition results.
The rest of this paper is organized as follows.
Section 2 introduces the methods, including data acquisition and stimuli, the graphical user interface (GUI) and BCI paradigm, and data processing and algorithms; the algorithms cover feature extraction, feature selection and MLDW-PSO, and classification.
Section 3 presents three experiments and their results, including two offline experiments and one online experiment.
Section 4 offers a discussion of the results.
Section 5 provides our conclusion.
4. Discussion
In order to effectively identify the emotional state based on EEG, this study focuses on the feature selection methods for constructing feature sets. In this study, we performed two offline experiments using the DEAP dataset and designed an online emotion recognition experiment, all of which used the MLDW-PSO method for feature selection. In the offline experiments, different dimensional features were first extracted, then feature selection methods were applied to find the best feature combination, and finally the four emotion types were classified by the SVM classifier. Among the three feature selection methods of Relief, standard PSO, and MLDW-PSO, we found that the MLDW-PSO-based feature selection algorithm achieved the highest accuracy for all the subjects in the DEAP dataset. To further validate the efficiency of MLDW-PSO-based feature selection, we developed a real-time emotion recognition system to recognize subjects’ positive and negative emotional states while watching video clips. In the online emotion recognition, the MLDW-PSO feature selection method was used to obtain high accuracy for all 10 healthy subjects. All of the results showed that our proposed MLDW-PSO feature selection method improves the performance of EEG-based emotion recognition.
In our offline experiments, the MLDW-PSO feature selection method achieved the highest average accuracy, which may be due to the following three reasons. First, a subject-dependent model was applied to analyze affective states. A recent study [
11] claimed that subject-dependent emotion recognition usually performs better than subject-independent emotion recognition. Second, the PSO algorithm heuristically searches for the best combination based on the classification accuracy as the fitness function, rather than just roughly combining features. Third, a new group of nonlinear strategies, called MLDW, was proposed to easily refine the process of decreasing the inertia weight. The results suggest that the PSO with the w6 strategy is a good choice for solving unimodal problems due to its fast convergence speed.
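The second reason above, searching over feature combinations with a fitness function, can be sketched with a minimal binary PSO. This is an illustration under stated assumptions, not the authors' implementation: it uses the sigmoid-based binary PSO update, a plain linearly-decreasing inertia weight standing in for the MLDW schedule, and a hypothetical toy fitness in place of the SVM classification accuracy. All names and parameter values are assumptions.

```python
import math
import random


def binary_pso_select(fitness, n_features, n_particles=10, n_iter=30,
                      w_start=0.9, w_end=0.4, c1=2.0, c2=2.0, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask)."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # personal best masks
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # global best mask

    for t in range(n_iter):
        # Plain linear decrease; a stand-in for the paper's MLDW schedule.
        w = w_start - (w_start - w_end) * t / max(n_iter - 1, 1)
        for i in range(n_particles):
            for d in range(n_features):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-6.0, min(6.0, v))   # clamp the velocity
                # Sigmoid maps velocity to the probability of selecting d.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


def toy_fitness(mask, informative=(0, 3, 5)):
    """Hypothetical stand-in for cross-validated classifier accuracy:
    reward informative features, penalize every extra feature."""
    hits = sum(mask[i] for i in informative)
    extras = sum(mask) - hits
    return hits - 0.5 * extras


best_mask, best_fit = binary_pso_select(toy_fitness, n_features=8)
print(best_mask, best_fit)
```

To use this for emotion recognition, the toy fitness would be replaced by a function that trains and validates a classifier on the features selected by the mask and returns its accuracy.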
In this study, the subject-dependent model was used to find suitable emotion-related EEG features. The subject-dependent model avoids problems related to variability between subjects, but an emotion classification model must be built for each specific subject. For the subjects in the DEAP dataset, the accuracy of all features without feature selection (in the subject-independent model) did not differ from that of each individual feature (in Experiment II). In fact, different subjects have different sensitivities to different emotion-related features. In this respect, subject-dependent feature selection did enhance the performance of emotion recognition. The accuracies of all three feature selection methods (in the subject-dependent model) are significantly higher than the accuracy of each individual feature (p < 0.05). These results are consistent with those in the literature [11, 31]. Taken together, we can conclude that the subject-dependent model can achieve higher accuracy than the subject-independent model due to inter-subject variability.
For Experiment II, the accuracy of the four-class emotion recognition using the MLDW-PSO algorithm reached 76.67%, which is higher than the latest results reported in the review [
32]. Previously, Chen et al. [
33] proposed a three-stage decision framework for recognizing four emotions of multiple subjects, and reported a classification accuracy of 70.04% for the same four emotions. Gupta et al. [
34] studied the channel-specific nature of EEG signals and proposed an effective method based on a flexible analytic wavelet transform to recognize the above four emotions with an accuracy of 71.43%. In addition, Zheng et al. [
8] studied stable patterns of EEG over time for emotion recognition using a machine learning approach, and achieved a classification accuracy of 69.67%. By comparison, our method achieves higher accuracy in EEG-based emotion recognition.
For the real-time emotion recognition system, an average online accuracy of 89.50 ± 5.68% for recognizing two emotion states using the MLDW-PSO algorithm was attained, which is significantly higher than the chance level. Comparison with similar studies further supports the effectiveness of our emotion recognition system. Liu et al. [
35] proposed a real-time movie-induced emotion recognition system for identifying an individual's emotional states, and achieved an overall accuracy of 86.63 ± 0.27% in distinguishing positive from negative emotions. Using stimuli materials similar to [
35], the average accuracy of [
36] in discriminating self-induced positive emotions from negative emotions was 87.20 ± 8.74%. Jatupaiboon et al. [
37] classified happy and unhappy emotions using a real-time emotion recognition system triggered by pictures and classical music, and achieved an average accuracy of 75.62 ± 10.65%. The classification accuracy of all these studies is lower than the accuracy of our system in identifying two emotional states.
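The comparison with the chance level mentioned above can be illustrated with a one-sided binomial test for a two-class task. This is only a sketch: the statistical test actually used is not stated in this section, and the trial counts below are hypothetical values chosen for illustration, not data from the experiment.

```python
from math import comb


def binomial_p_value(n_correct, n_trials, chance=0.5):
    """One-sided probability of getting at least n_correct hits out of
    n_trials if the classifier were guessing at the chance level."""
    return sum(comb(n_trials, k) * chance**k * (1 - chance)**(n_trials - k)
               for k in range(n_correct, n_trials + 1))


# Hypothetical example: 18 of 20 online trials classified correctly.
p = binomial_p_value(18, 20)
print(p, p < 0.05)
```

A small p-value indicates that the observed accuracy would be unlikely under random guessing between the two emotional states.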
In order to verify whether our online emotion recognition system can evoke positive and negative emotions, we plotted topographical maps of the average DE features across trials of ten subjects with happy or sad emotional states in four bands (theta, alpha, beta, and gamma bands). Specifically, the features were averaged across all ten healthy subjects and all trials. As shown in
Figure 5, the brain neural activity is different when watching negative videos and positive videos. The brain neural activity maps of positive emotions show higher power than those of negative emotions. In the theta and beta bands, the right frontal lobe area and right temporal area were more activated in the positive emotion state than in the negative emotion state. In the gamma band, the left occipital lobe and the right temporal lobe regions show higher power in the positive emotion state than in the negative emotion state. These patterns are consistent with those reported in previous studies [
37,
38]. These findings also suggest that our high recognition accuracies were not the result of EMG activity.
There are a few factors that may contribute to the high performance of our system. First, the subjects' emotional states were easily evoked, thanks to the proper selection of stimulus material. Another possible factor is the setting of online feedback, which not only focuses participants’ attention during the trial, but also encourages participants to adjust their strategies to regulate their emotions to be consistent with the emotions of the stimulus materials. Furthermore, the application of MLDW-PSO feature selection in the BCI system is also an important factor. One possible reason is that the trained classifier used MLDW-PSO to capture more valid information from the new feature vector, thereby enhancing its ability to perform pattern recognition.