**4. Results**

#### *4.1. MMN Results*

Examples of frequency and duration mismatch responses compared to a standard tone are illustrated in Figure 6. In Figure 6a,b, negative peaks between 200 and 400 ms, which indicate mismatch ERP responses, were found in both the T8 and in-ear EEG signals. Different types of mismatch ERP signals, such as frequency and duration mismatches, may vary in amplitude, but their general shape contains a significant negative peak around 200–400 ms [40]. For comparison, the duration mismatch (Figure 6d) and frequency mismatch (Figure 6e) responses from the traditional MMN experiments in the previous study [40] are shown as dotted lines; these also show negative peaks between 200 and 400 ms. Examples of ERP responses to standard beeps are shown in Figure 6c. In contrast to the mismatch responses, no negative peaks are present between 200 and 300 ms. This conforms to the theory in [39].

**Figure 6.** Examples of EEG after mismatch trials. (**a**) An example of a frequency mismatch EEG event-related potential (ERP) response. (**b**) An example of a duration mismatch EEG ERP response. (**c**) An example of an EEG ERP response after a standard beep. The blue and red lines in (**a**–**c**) show the in-ear and T8 EEG signals, respectively. (**d**,**e**) Duration and frequency mismatch responses from [40] for comparison. The dotted lines in (**d**,**e**) show the ERP responses from traditional mismatch negativity (MMN) experiments similar to our work. The thin and thick lines in (**d**,**e**) show the MMN responses for specially designed experiments from [40].

Furthermore, the similarity between the red and blue lines in Figure 6a–c shows a high correlation between the in-ear and T8 EEG signals. The correlation between T8 and in-ear EEG was approximately 0.8530 across all trials. These MMN results indicate that the signal measured by the in-ear device was EEG, as its ERP response characteristics conformed to those of scalp EEG. Additionally, the in-ear EEG signal quality was similar to that of EEG measured at the nearby T8 scalp location.
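The text does not state how the correlation was computed. As a minimal sketch, assuming a Pearson correlation coefficient evaluated on time-aligned epochs, the comparison between the two channels could look as follows. The function name `pearson_r` and the sample values are hypothetical and serve only as illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length signals."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("signals must have the same length (at least 2 samples)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return cov / norm

# Hypothetical epochs: the in-ear trace is an attenuated, offset copy of T8.
t8_epoch = [0.0, -1.2, -3.5, -2.1, 0.4, 1.0]
in_ear_epoch = [0.5 * v + 0.2 for v in t8_epoch]
r = pearson_r(t8_epoch, in_ear_epoch)
```

Because the coefficient is invariant to scaling and offset, it measures similarity of waveform shape rather than of amplitude, which fits a comparison between electrodes with different contact properties.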

The average frequency mismatch response compared to the standard tone is displayed in Figure 7. The red and blue lines show similar patterns (signs of slopes) for the T8 and in-ear EEG. This result supports the findings of [16,36], which report a high correlation between in-ear EEG and T7 and T8 EEG signals. The red and blue lines differ in amplitude because the signals shown were averaged across all trials, rather than being a raw data comparison (as shown in Figure 7a–c).

**Figure 7.** Average EEG of mismatch and standard trials. (**a**) Average in-ear EEG ERP responses from all mismatch trials. (**b**) Average T8 EEG from all mismatch trials. (**c**) Average in-ear EEG after standard beeps. (**d**) Average T8 EEG after standard beeps.
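The trial averaging behind Figure 7 and the search for a negative deflection in the 200–400 ms window can be sketched as follows. This is an illustrative outline, not the processing code used in the study: the sampling rate, the epoch layout (sample 0 at stimulus onset), and the function names are assumptions.

```python
def average_epochs(epochs):
    """Average equal-length single-trial epochs sample by sample."""
    n_trials = len(epochs)
    return [sum(samples) / n_trials for samples in zip(*epochs)]

def most_negative_peak(erp, fs_hz, start_ms, end_ms):
    """Return (latency_ms, amplitude) of the minimum inside a latency window.

    Sample 0 is assumed to coincide with stimulus onset.
    """
    first = int(round(start_ms * fs_hz / 1000))
    last = min(int(round(end_ms * fs_hz / 1000)), len(erp) - 1)
    idx = min(range(first, last + 1), key=lambda i: erp[i])
    return idx * 1000 / fs_hz, erp[idx]

# Hypothetical data: two 500 ms trials sampled at 100 Hz with a dip at 300 ms.
fs_hz = 100
trial_a = [0.0] * 50
trial_b = [0.0] * 50
trial_a[30] = -4.0
trial_b[30] = -2.0
erp = average_epochs([trial_a, trial_b])
latency_ms, amplitude = most_negative_peak(erp, fs_hz, 200, 400)
```

Averaging attenuates activity that is not time-locked to the stimulus, which is one reason averaged traces can differ in amplitude from single-trial examples.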

The MMN results show that in-ear EEG correlates highly with T7 and T8 EEG signals. Furthermore, the agreement of the signal response with the theory in [39] shows that the in-ear EEG signal could be used accurately in a standard ERP test. Hence, the validity of the in-ear EEG signal was substantiated.

#### *4.2. DEAP Data Analysis*

Emotion classification using the T7 and T8 signals from the DEAP dataset was performed by SVM, as described in Section 2.4. Data from 32 subjects, with 40 trials per subject, were used for the classification. Ten-fold cross-validation was applied to suppress bias. In each fold, 36 trials were used as the training set and the other four as the test set. Ten different sets were trained and tested for each subject.
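The 36/4 partition described above corresponds to a standard ten-fold split of 40 trials. A minimal sketch of such a split is given below. Shuffling the trials with a fixed seed is an assumption, since the text does not say whether folds were drawn randomly or in recording order, and `ten_fold_splits` is a hypothetical helper name.

```python
import random

def ten_fold_splits(n_trials=40, n_folds=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Assumes n_trials is divisible by n_folds, as with 40 trials and 10 folds.
    """
    indices = list(range(n_trials))
    random.Random(seed).shuffle(indices)
    fold_size = n_trials // n_folds
    for k in range(n_folds):
        test = indices[k * fold_size:(k + 1) * fold_size]
        train = indices[:k * fold_size] + indices[(k + 1) * fold_size:]
        yield train, test

splits = list(ten_fold_splits())
```

Each trial appears in exactly one test fold, so the per-subject accuracy is the mean over ten classifiers that never see their own test trials during training.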

The accuracy achieved was approximately 69.85 percent for valence classification and 78.7 percent for arousal classification. The overall accuracy for classifying four emotions was approximately 58.12 percent.

Furthermore, emotion classification using the T7 or T8 channel alone was conducted and compared. The accuracies using T7 were approximately 71.30% for valence, 76.67% for arousal, and 57.56% for four emotions (valence and arousal combined); the accuracies using T8 were approximately 70.93% for valence, 77.20% for arousal, and 57.34% for four emotions (valence and arousal combined).

The *t*-test result from SPSS (IBM Corp., New York, USA) indicated that there was no statistically significant difference in classifying emotions between T7 and T8. The accuracy was approximately 57.56 ± 15.19% for T7 and 57.34 ± 16.40% for T8. The two-tailed *p*-value was 0.955, which was greater than the 0.05 threshold, indicating that there was no significant difference between classifying emotion using T7 and T8.
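The reported comparison can be approximately reproduced from the summary statistics alone. The sketch below assumes that the ± values are standard deviations across the 32 DEAP subjects and that an independent-samples, pooled-variance *t*-test was used; the text does not state which variant SPSS ran, and a paired test would require the per-subject accuracies. The function names are hypothetical, and the *p*-value is obtained by numerically integrating the Student *t* density instead of calling a statistics library.

```python
import math

def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sample_t_from_summary(mean1, sd1, mean2, sd2, n):
    """Equal-n, pooled-variance t-test from group means and SDs.

    Returns (t, df, two_tailed_p).
    """
    se = math.sqrt((sd1 ** 2 + sd2 ** 2) / n)
    t = (mean1 - mean2) / se
    df = 2 * n - 2
    # Integrate the density over [0, |t|] with Simpson's rule; by symmetry
    # the two-tailed p-value is 1 - 2 * (that area).
    steps = 2000  # must be even
    h = abs(t) / steps
    area = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    return t, df, 1 - 2 * area

# Summary statistics reported for T7 and T8 (four-emotion accuracy, in percent).
t, df, p = two_sample_t_from_summary(57.56, 15.19, 57.34, 16.40, n=32)
```

Under these assumptions the *p*-value lands close to the reported 0.955, far above the 0.05 threshold.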

The results show that T7 and T8 data could be used as a single channel for valence, arousal, and the simple emotion classification, as the classification accuracy is comparable to the multichannel classification model in [7].

#### *4.3. In-Ear EEG Emotion Classification*

Only two of the thirteen subjects, subjects 4 and 10, chose to wear the in-ear EEG in the left ear. The raw EEG data showed no statistically significant difference between EEG collected from the left and right ears (*p*-value = 0.95).

In-ear EEG signals were recorded while subjects viewed stimulating pictures during the experiment described in Section 3.5. The EEG signal was filtered using a 4th-order Butterworth filter to notch out power line noise at 50 Hz. The signal was then separated by Butterworth bandpass filters into four frequency bands: theta (4–8 Hz), alpha (8–12 Hz), beta (12–32 Hz), and gamma (30–48 Hz). Six statistical parameters by Picard et al. [48] were used for signal feature extraction on a 3 s time-lapsed window. The SVM model described in Section 3.4 was used for classification. Ten-fold cross-validation was applied for classifying each subject's data. All signal processing and classification were performed offline using Matlab (The MathWorks, Inc., Natick, MA, USA).
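The feature extraction step can be illustrated with a short sketch. It assumes that [48] refers to the six statistical features commonly attributed to Picard et al.: the mean, the standard deviation, and the mean absolute first and second differences of the raw and of the normalized signal. It also assumes consecutive, non-overlapping 3 s windows, which the text does not specify. The study itself used Matlab; Python and the function names here are used only for illustration.

```python
import math

def picard_features(window):
    """Six statistical features of one signal window (list of floats)."""
    n = len(window)
    if n < 3:
        raise ValueError("need at least three samples")
    mean = sum(window) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / (n - 1))
    if std == 0:
        raise ValueError("normalized features are undefined for a constant window")
    # Mean absolute difference between samples one and two steps apart.
    d1 = sum(abs(window[i + 1] - window[i]) for i in range(n - 1)) / (n - 1)
    d2 = sum(abs(window[i + 2] - window[i]) for i in range(n - 2)) / (n - 2)
    # Normalizing to zero mean and unit variance rescales differences by 1/std.
    return [mean, std, d1, d1 / std, d2, d2 / std]

def sliding_windows(signal, fs_hz, window_s=3.0):
    """Split a signal into consecutive non-overlapping windows of window_s seconds."""
    size = int(round(fs_hz * window_s))
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

# Hypothetical band-filtered signal: 10 s at an assumed 2 Hz sampling rate.
signal = [math.sin(0.3 * i) for i in range(20)]
features = [picard_features(w) for w in sliding_windows(signal, fs_hz=2, window_s=3.0)]
```

In a pipeline like the one described, this extraction would run once per frequency band, and the resulting feature vectors would be concatenated before being passed to the SVM.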

Binary classification was performed by SVM on valence (positive or negative) and arousal (high or low). The four-emotion classification was performed by mapping the valence and arousal classification results onto the simplified valence–arousal emotional model in Figure 5. For example, positive valence and high arousal was classified as happy. Hence, the simplified emotions could be classified into four groups: positive valence/high arousal, positive valence/low arousal, negative valence/high arousal, or negative valence/low arousal. Classification accuracy was calculated by comparing the SVM classifications with the subjects' own evaluations. The classification accuracy of in-ear EEG is shown in Table 1.
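The combination of the two binary outputs into four classes, and the accuracy computation against the self-reports, can be sketched as below. The quadrant labels follow the four groups named in the text; the emotion names of Figure 5 other than "happy" are not given here, so they are not used. The trial data are hypothetical.

```python
def quadrant(valence_positive, arousal_high):
    """Combine two binary predictions into one of four valence-arousal classes."""
    valence = "positive valence" if valence_positive else "negative valence"
    arousal = "high arousal" if arousal_high else "low arousal"
    return f"{valence}/{arousal}"

def accuracy(predicted, reported):
    """Fraction of trials where the prediction matches the subject's self-rating."""
    matches = sum(p == r for p, r in zip(predicted, reported))
    return matches / len(reported)

# Hypothetical outputs for four trials (True = positive valence / high arousal).
pred = [quadrant(v, a) for v, a in
        [(True, True), (True, False), (False, True), (False, False)]]
self_report = [quadrant(v, a) for v, a in
               [(True, True), (True, True), (False, True), (False, False)]]
four_class_accuracy = accuracy(pred, self_report)
```

A trial counts as correct in the four-class setting only when both binary predictions are correct, so the four-class accuracy cannot exceed the lower of the valence and arousal accuracies.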


**Table 1.** Emotion classification result from each subject.

The emotion classification accuracy based on the valence–arousal emotion model was approximately 73.01% for valence, 75.70% for arousal, and 59.23% for all four emotions. Subjects 4 and 10 inserted the in-ear EEG in the left ear, while the rest inserted it in the right. Subject 12 was female.

The accuracies of emotion classification using the in-ear EEG from our experiment and the T7 and T8 EEG signals from the DEAP dataset were comparable. According to a multiple comparison using the Bonferroni test, there was no statistically significant difference between emotion classification using T7, T8, or in-ear EEG. The two-tailed *p*-values were 0.449 and 0.456, both above the 0.05 threshold, indicating no significant difference in classifying emotion using in-ear and T7/T8 EEG. Box plots of the classification results are shown in Figures 8–10.
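For readers unfamiliar with the Bonferroni procedure, a minimal sketch of the correction is given below. It shows only the general rule (multiply each unadjusted *p*-value by the number of comparisons and cap the result at 1). The input values are hypothetical, and whether the *p*-values reported above are adjusted or unadjusted is not stated in the text.

```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of comparisons, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def significant(p_values, alpha=0.05):
    """Flag comparisons whose adjusted p-value falls below alpha."""
    return [p < alpha for p in bonferroni_adjust(p_values)]

# Hypothetical unadjusted p-values for three pairwise comparisons.
raw = [0.01, 0.04, 0.30]
adjusted = bonferroni_adjust(raw)
flags = significant(raw)
```

The correction is conservative: a comparison that is not significant before adjustment can never become significant after it.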

**Figure 8.** Box plot comparison among emotion classification using single channel T7, T8, and in-ear EEGs. Grey areas indicate proportions of classification accuracy above the median. Orange areas indicate proportions of classification accuracy below the median. X indicates the mean accuracy.

**Figure 9.** Box plot comparison among valence classifications using single channel T7, T8, and in-ear EEGs. Grey areas indicate proportions of classification accuracy above the median. Orange areas indicate proportions of classification accuracy below the median. X indicates the mean accuracy.

**Figure 10.** Box plot comparison among arousal classifications using single channel T7, T8, and in-ear EEGs. Grey areas indicate proportions of classification accuracy above the median. Orange areas indicate proportions of classification accuracy below the median. X indicates the mean accuracy.

Overall four-emotion classification accuracies were approximately 53.72% for in-ear EEG and 58.12% for T7/T8 EEG. Valence classification accuracies were 71.07% and 69.85% for in-ear and T7/T8 EEG, respectively. Arousal classification accuracies were 72.89% and 78.7% for in-ear and T7/T8 EEG, respectively. These comparable accuracies indicate that in-ear EEG has potential for emotion classification, as T7 and T8 electrodes do.
