#### **1. Background**

As societies around the world increasingly face the issue of aging populations, how to take care of elderly people effectively becomes an important challenge, especially for the less fortunate who live alone. In order to ensure their physical and mental well-being and provide emergency assistance, monitoring technology could potentially be part of the solution. In particular, wearable devices or smart sensors could be employed for effective and practical monitoring. Apart from conventional physiological signals, such as heart rate or ECG, that can be monitored to analyze the wearer's health condition, emotional state is one of the factors that reflects mental state and can greatly impact decision-making [1]. Emotion monitoring could therefore also serve as another source of information for elderly care and remote-patient support systems.

Emotion itself is very complex [2]. There are different interpretations of the many kinds of emotions, making emotion recognition far from straightforward. For research purposes, several simplified models have been proposed, which can be categorized into two approaches: defining basic emotions and using a dimensional model. The most widely used set of basic emotions is the six basic emotions (i.e., anger, disgust, fear, joy, sadness, and surprise) generally used in facial expression recognition [3]. For the second approach, the common dimensional model is characterized by two main dimensions, valence and arousal. Valence ranges from negative to positive, whereas arousal ranges from calm to excited [4]. This model has been used in a number of studies, because it is easier to express an emotion in terms of valence and arousal than in terms of basic emotions, which can be confused by emotion names [5].

For a long time, most emotion recognition studies have focused on facial expressions and speech. For continuous monitoring purposes, these approaches may not be the most suitable, as they may suffer from practical issues such as ambient light and noise. For camera-based facial recognition in particular, privacy is also a concern. Alternatively, physiological signals, such as galvanic skin response (GSR), electrocardiogram (ECG), skin temperature (ST), and electroencephalogram (EEG), which occur continuously and are harder to conceal, have been considered. As emotions are thought to be related to activity in brain areas that direct our attention, motivate our behavior, and determine the significance of what is going on around us, EEG, the signal from voltage fluctuations in the brain generated continuously at the level of cellular membranes [6], has been of particular interest.

Emotion classification by EEG has been shown to achieve high accuracy [1,7–16]. However, most of those works employed multi-channel EEG headsets. In reality, these conventional multi-channel EEG headsets are not suitable for continuous monitoring due to their size and setup difficulty. Ideally, the EEG recording device used for emotion monitoring should be small, take little time to set up, and be comfortable to wear.

Given such requirements, the in-ear EEG, an EEG recording device introduced by Looney et al. in 2012 [17], could be of interest. The potential benefits of an in-ear EEG include the fact that it does not obstruct the visual field. It is positionally robust, as it is generally fixed inside the ear canal. It is unobtrusive, as it is similar to devices people commonly use, such as earphones, earbuds, and earplugs. It is unlikely to encounter sweat, and it is user-friendly in terms of setup and maintenance: unlike scalp EEG devices, which may require experienced assistants to set up, in-ear EEG devices can simply be put into the user's ears. However, an in-ear EEG also has drawbacks. It has far fewer electrodes and covers a much smaller area than scalp EEG, so its application accuracy is expected to be lower.

Our work aimed to build an in-ear EEG device and evaluate it in terms of signal quality compared to signals measured via scalp EEG at comparable positions (i.e., T7 and T8 in the international 10–20 system [18]). The international 10–20 system is an internationally recognized system for labelling scalp locations for EEG measurement; T7 is located above the left ear, while T8 is above the right ear. The prospect of using an in-ear EEG for emotion classification was also investigated experimentally.

The paper is organized into six sections. Related works are discussed in Section 2. Section 3 describes material selection, system design, and detailed experimental protocols. Experimental results and analysis are presented in Section 4. Significant findings from the results are discussed in Section 5. Finally, the conclusions are presented in Section 6.

#### **2. Related Work**

#### *2.1. Scalp-Based EEG Emotion Classification*

Scalp-based emotion classification by multi-channel EEG has been an active field of research [1,7–16]. A review of some of those works can be found in [7]. The majority of the works have focused on signal processing techniques to improve accuracy. For example, Koelstra et al. [19] presented methods for single-trial classification using both EEG and peripheral physiological signals. The power spectral density (PSD) of EEG signals was used as the primary feature. A support vector machine (SVM) classifier was used to classify two levels of valence states and two levels of arousal states. For EEG analysis, average and maximum classification rates of 55.7% and 67.0% were obtained for arousal, and 58.8% and 76.0% for valence. Huang et al. [20] developed an asymmetry spatial pattern (ASP) technique to extract features for an EEG-based emotion recognition algorithm. The system employed k-nearest neighbor (K-NN), naive Bayes (NB), and SVM methods for emotion classification. The average accuracy rates for valence and arousal were 66.05% and 82.46%, respectively. We note here that several studies [7,21–23] have targeted the PSD of EEG data as the input features and performed emotion classification using SVM. Other machine learning techniques, such as naive Bayes, K-NN, LDA, and ANN, have been applied in other studies [9,24–26].
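As a concrete illustration of this common pipeline (PSD features fed into an SVM), the following sketch shows one way it is typically implemented. The band definitions, sampling rate, and epoch shapes are illustrative assumptions, not details taken from the cited studies.

```python
import numpy as np
from scipy.signal import welch
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Illustrative frequency bands (Hz); the cited studies vary in their choices.
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

def band_power_features(epoch, fs=128.0):
    """PSD band-power features for one EEG epoch (n_channels x n_samples)."""
    freqs, psd = welch(epoch, fs=fs, nperseg=int(fs))  # Welch PSD per channel
    feats = []
    for lo, hi in BANDS.values():
        mask = (freqs >= lo) & (freqs < hi)
        feats.append(psd[:, mask].mean(axis=1))  # mean power per channel per band
    return np.concatenate(feats)

def train_valence_classifier(epochs, labels, fs=128.0):
    """Train a binary valence (or arousal) SVM classifier on PSD features."""
    X = np.array([band_power_features(e, fs) for e in epochs])
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    clf.fit(X, labels)
    return clf
```

The same feature matrix could equally be fed to K-NN, naive Bayes, or LDA, as in the other studies mentioned above.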

Other areas of focus for scalp-based EEG emotion classification include those in [15,27], which look to develop wearable headband solutions. However, for monitoring purposes, these designs may suffer in conditions such as a warm climate; it might be uncomfortable to wear a headband for a long duration due to sweating. Moreover, sweat could affect the electrode impedance, resulting in a noisy signal and inaccurate monitoring.

#### *2.2. In-Ear EEG Development*

Originally, the in-ear EEG, an EEG recording device introduced by Looney et al. in 2012 [17], was demonstrated to have wearable characteristics that could potentially fulfill monitoring requirements [28]. It is small, can be worn in the ears, and is similar to earplugs or hands-free devices. Since then, research has focused on areas such as materials; system design, especially in terms of practicality; and the verification of signal quality [17,27,29–31]. For example, Goverdovsky et al. [30] proposed a prototype called Ear-EEG that consists of a viscoelastic memory-foam earplug substrate and conductive cloth electrodes to ensure conformance with the ear canal surface for motion artifact reduction. Kulkarni et al. [27] designed a soft and foldable electrode that can capture EEG from the complex outer surfaces of the ear and the mastoid using epidermal electronics with fractal mesh layouts. Recent work by Kappel et al. [31] developed an in-ear EEG with a soft earpiece, which required customized molding to fit individual ears. The prototype showed good signal quality and potential for long-term EEG monitoring.

#### *2.3. In-Ear EEG for Control*

In the field of brain–computer interfaces, artifacts in EEG signals created by muscle activity, such as eye blinks or other facial expressions, have been studied as a means of controlling external devices. For in-ear implementations, major works in the area include the following. Matthies et al. [32] reported an in-ear headset based on a hacked NeuroSky EEG sensor. The prototype utilized eye winking and ear wiggling for explicit control of smartphone functions. Additionally, in 2017, Matthies et al. [33] placed multiple electrodes onto a foam earplug to detect 25 facial expressions and head gestures with four different sensing technologies. Five gestures could be detected with accuracy above 90%, and 14 gestures with accuracy above 50%. The prototype was also shown to be robust in practical situations, such as walking.

#### *2.4. In-Ear EEG for Medical and Healthcare Applications*

Medical and healthcare applications have also been a major theme in in-ear EEG research, especially for monitoring purposes [34]. Sleep has been of particular interest [35,36]. For example, Nguyen et al. [35] proposed a dual-channel EEG in the form of an earplug that showed stable sleep stage classification with an average accuracy above 95%. In terms of emotion monitoring, which is closely related to this work, previous studies [17,37] showed that the measured in-ear EEG signal was similar to that of the T7 and T8 channels of the 10–20 system [18]. Moreover, a previous work also showed that T7 and T8 provide informative data for emotion classification [7]. These results suggest that an in-ear EEG has the potential to classify emotions, which our work set out to investigate.

#### **3. Materials and Methods**

In this work, to achieve the goal of realizing an in-ear EEG, we sought answers to the following questions:

(1) What type of in-ear EEG should be studied (physically, design-wise, and engineering-wise)?

(2) How does the signal quality of the resulting in-ear EEG compare to that of a conventional scalp EEG?

For (1), we reviewed previous works and built some prototypes to evaluate their suitability. Once we decided upon a solution, we moved on to verify the quality of the measured signals against standard measurements to answer (2). It was important to do this before the main experiment, as the signals needed to be reasonably comparable before we could move on to emotion measurement. To achieve that, we used mismatch negativity (MMN) to compare auditory event-related potentials (ERPs) measured via our in-ear EEG with those measured with a conventional headband EEG at the T7 and T8 positions. Finally, for emotion classification, we needed a reference against which to benchmark our measured results, so the DEAP dataset was used to calculate the accuracy of emotion classification at T7 and T8. Its results were then used as a reference for comparison with our own in-ear EEG measurements. All of this is explained in more detail in the following sections.

#### *3.1. In-Ear EEG Development*

#### 3.1.1. Earpieces Selection

Recent research on in-ear EEG devices was studied [17,30,31,37]. There are currently two types of in-ear EEG devices: one is a personally customized earpiece, as illustrated in Figure 1, and the other is generic or non-customized. The first type is based on earmolds created through wax impressions, 3D scanning, CAD editing, 3D printing, and a wiring process, respectively. This type of in-ear EEG device is robust, as it fits the owner's ear canal exactly. However, it is relatively costly. Hence, this type of in-ear EEG device was not considered in this study, as we wanted a generic and low-cost device.

**Figure 1.** The first in-ear EEG prototype introduced by David Looney et al. in 2012 [17].

The generic prototype is usually based on a cylinder-shaped material. The first generic in-ear EEG device was based on a cylinder of silicone, as illustrated in Figure 2 [37]. However, it has a flexibility disadvantage, as it is not guaranteed to fit into all ear canals [30]. The improved prototype used a cylinder-shaped memory foam instead of silicone.

**Figure 2.** Generic in-ear EEG prototype [37]. The left side shows a drawing, whereas the right side shows a model prototype.

Nevertheless, from our tests, the in-ear EEG device built from memory-foam earplugs could not fit into small ear canals. Furthermore, once fitted, it could also gradually slip out of the ear canal. Thus, in this study, the main body of the in-ear EEG device was changed to earphone rubbers, which were tested and found to have high flexibility. Additionally, they come in different sizes, which can be selected to fit different ear canals, as shown in Figure 3.

**Figure 3.** Different sizes of earphone rubbers.

#### 3.1.2. Electrode Selection

Three different materials were considered and tested for the in-ear EEG device electrodes: half-sphere-shaped silver, aluminum foil, and silver-adhesive fabric. Half-sphere-shaped silver is probably one of the most widely used materials for EEG electrodes. However, according to [30], the electrodes should be as similar in flexibility as possible to the earpieces to achieve robust contact. Half-sphere silver is solid and not as flexible as the earphone rubbers, and therefore was not selected. Aluminum foil, although it has low impedance and good flexibility, could not be easily attached to electrical wires, because it does not bond well with solder.

The silver-adhesive fabric, which was used with memory foam in a previous in-ear EEG prototype [30], has flexibility similar to that of memory foam and earphone rubber. It could also be glued and sewn to the wires without soldering. Therefore, silver-adhesive fabric was considered a suitable material for the electrodes of our in-ear EEG device.

In this study, the fabric was made slightly larger than in the previous study [30] for better contact. The fabric was glued to the ear rubbers, and shielded wires were then sewn to the fabric. The number of electrodes was also reduced to one channel per ear, as the EEG signals among channels within the same ear were very similar in previous studies [17]. The shielded wire was slightly larger and heavier than normal wire, but it significantly reduced signal noise and was therefore preferred.

Our final in-ear EEG device prototype is shown in Figure 4. The total material cost per piece is approximately 10 US dollars. The device's impedance was measured to be between 0.05 and 5.5 ohms, which is comparable to that of OpenBCI electrodes, a type of commercial EEG electrode [38].

**Figure 4.** Single channel electrode used in the experiment using earphone rubber and silver-adhesive fabric electrode.

#### *3.2. In-Ear EEG Signal Verification*

After the in-ear EEG devices were assembled, signal verification was performed. Mismatch negativity (MMN) is one of the most widely used methods for EEG verification [39,40] and was used to verify in-ear EEG signals in a previous study [41]; hence, it was also applied in our work. MMN is an experiment that observes the auditory event-related potential (ERP), which is a subject's EEG response to an unexpected change in sensory stimulation.

Our MMN experiment started by playing a short beep tone repeatedly until the subject was familiar with it. Unexpected mismatch tones were then inserted among the familiar tones. A mismatch tone could differ in frequency (lower or higher), duration (unusually long beep), intensity (unusually loud or soft), or phase. A mismatch tone, if acknowledged, elicits an ERP response in the form of a negative peak, usually between 90 and 250 milliseconds after the beep [40]. The ERP latency may vary according to personal musical experience [42].
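The negative peak described above can be located programmatically in the deviant-minus-standard difference wave. The sketch below assumes pre-averaged ERPs and a hypothetical sampling rate; it is an illustration, not the analysis code used in this work.

```python
import numpy as np

def mmn_peak(standard_erp, deviant_erp, fs=250.0, window=(0.090, 0.250)):
    """Locate the MMN as the most negative point of the deviant-minus-standard
    difference wave within the 90-250 ms post-stimulus window described above.
    fs (sampling rate) is an assumption; use the amplifier's actual rate."""
    diff = np.asarray(deviant_erp) - np.asarray(standard_erp)
    i0, i1 = int(window[0] * fs), int(window[1] * fs)
    idx = i0 + int(np.argmin(diff[i0:i1]))  # most negative sample in the window
    return idx / fs, diff[idx]              # (latency in seconds, amplitude)
```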

The MMN experiment parameters in this study were set according to a previous study [40]. A combination of three pure tones (500, 1000, and 1500 Hz), each lasting 75 milliseconds, was used as the standard tone, whereas two types of mismatch tones were applied. The first type was a frequency mismatch, with a pitch 10% lower or higher randomly applied to each frequency. The other type was a duration mismatch tone lasting 100 milliseconds, 25 milliseconds longer than the standard tone. The standard tone was played 15 times to familiarize the subject with it before mismatch tones were inserted. Mismatch tones occurred with a probability of 0.5, but no consecutive mismatch tones were allowed.
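The stimulus schedule described above (15 standard lead-in tones, then mismatches with probability 0.5 and no two consecutive mismatches) can be sketched as follows. The total sequence length and random seed are illustrative assumptions.

```python
import random

def build_mmn_sequence(n_tones=200, n_lead_in=15, p_mismatch=0.5, seed=0):
    """Generate an MMN tone sequence: 15 standard tones first, then mismatch
    tones with probability 0.5, with no two consecutive mismatches."""
    rng = random.Random(seed)
    standards = [500, 1000, 1500]  # Hz; standard tones last 75 ms
    mismatch_kinds = ["freq_low", "freq_high", "duration"]
    seq = []
    prev_was_mismatch = False
    for i in range(n_tones):
        base = rng.choice(standards)
        if i >= n_lead_in and not prev_was_mismatch and rng.random() < p_mismatch:
            kind = rng.choice(mismatch_kinds)
            if kind == "freq_low":
                seq.append(("mismatch", base * 0.9, 75))   # 10% lower pitch
            elif kind == "freq_high":
                seq.append(("mismatch", base * 1.1, 75))   # 10% higher pitch
            else:
                seq.append(("mismatch", base, 100))        # 25 ms longer duration
            prev_was_mismatch = True
        else:
            seq.append(("standard", base, 75))
            prev_was_mismatch = False
    return seq  # list of (kind, frequency_hz, duration_ms)
```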

The tones were played through an earphone. The in-ear EEG device was inserted into the right ear while the earphone was inserted into the left ear. The ground electrode was placed on the forehead and the reference electrode on the right cheek, as suggested by [43]. An OpenBCI electrode was also placed at T8 as a comparison electrode. A Butterworth filter was used to notch out 50 Hz powerline noise, and another was applied as a bandpass to filter the EEG signal between 2 and 30 Hz. The signal correlation between T8 and the in-ear EEG was also calculated.
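The filtering and correlation steps above could be implemented along these lines. The sampling rate and filter order are assumptions, as the text does not specify them.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def preprocess(eeg, fs=250.0):
    """Notch out 50 Hz powerline noise with a Butterworth band-stop filter,
    then band-pass the signal between 2 and 30 Hz, as described in the text.
    fs (sampling rate) and the filter order are assumptions."""
    b, a = butter(4, [48.0, 52.0], btype="bandstop", fs=fs)  # 50 Hz notch
    x = filtfilt(b, a, eeg)                                  # zero-phase filtering
    b, a = butter(4, [2.0, 30.0], btype="bandpass", fs=fs)   # 2-30 Hz band-pass
    return filtfilt(b, a, x)

def channel_correlation(in_ear, t8, fs=250.0):
    """Pearson correlation between the filtered in-ear and scalp T8 signals."""
    a, b = preprocess(in_ear, fs), preprocess(t8, fs)
    return np.corrcoef(a, b)[0, 1]
```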

#### *3.3. Emotion Model and Emotion Stimuli*

The valence and arousal emotion model [4], shown in Figure 5, was used in this research, as it is a widely used simplified emotion model. Four emotions (happiness, calmness, sadness, and fear) were classified according to the four quadrants of the model.

**Figure 5.** Valence and arousal model. Anger and fear have low valence and high arousal. Happiness and excitement have high arousal and valence. Sadness and depression have low arousal and valence. Relaxation and pleasure have low arousal but high valence [4].
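Mapping valence-arousal ratings to the four target emotions by quadrant can be expressed as a simple rule. The midpoint of 5 on the 1-9 rating scale is an assumption for illustration.

```python
def quadrant_emotion(valence, arousal, midpoint=5.0):
    """Map valence/arousal ratings (1-9 scale) to the four quadrant emotions
    used in this work. The midpoint of 5 is an assumed threshold."""
    if valence >= midpoint and arousal >= midpoint:
        return "happiness"  # high valence, high arousal
    if valence >= midpoint:
        return "calmness"   # high valence, low arousal
    if arousal >= midpoint:
        return "fear"       # low valence, high arousal
    return "sadness"        # low valence, low arousal
```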

The International Affective Picture System (IAPS) [44] and the Geneva Affective Picture Database (GAPED) [45] were used as visual emotional stimuli. IAPS has been the most widely used in previous research [1]. It was developed at the Center for the Study of Emotion and Attention, University of Florida, by Lang et al. [44]. IAPS pictures are standardized and publicly available for use in emotional stimulation. The emotions elicited are based on two primary dimensions, valence and arousal; valence ranges from unpleasant to pleasant, while arousal ranges from calm to excited. Every picture has valence and arousal ratings on a scale from 1 (lowest) to 9 (highest). However, IAPS contains fewer pictures stimulating low valence and low arousal than needed, so additional pictures from GAPED were used.

The GAPED database was developed by Dan-Glauser et al. at the University of Geneva [45]. It was intended to supplement the limited number of IAPS pictures available to experimental researchers. GAPED provides a database of 730 pictures for emotion stimulation, also rated on the valence–arousal parameters used in IAPS [44]. Moreover, four classical music pieces from auditory emotion research [46] were also applied as stimuli. These pieces were likewise chosen based on the valence–arousal model, corresponding to the IAPS and GAPED pictures.
