
User Affect Elicitation with a Socially Emotional Robot

Mingyang Shao, Matt Snyder, Goldie Nejat and Beno Benhabib

1 Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, ON M5S 3G8, Canada
2 Yee Hong Centre for Geriatric Care, Mississauga, ON L5V 2X5, Canada
* Author to whom correspondence should be addressed.
Robotics 2020, 9(2), 44; https://doi.org/10.3390/robotics9020044
Submission received: 6 May 2020 / Revised: 1 June 2020 / Accepted: 1 June 2020 / Published: 3 June 2020
(This article belongs to the Special Issue Feature Papers 2020)

Abstract

To effectively communicate with people, social robots must be capable of detecting, interpreting, and responding to human affect during human–robot interactions (HRIs). In order to accurately detect user affect during HRIs, affect elicitation techniques need to be developed to create and train appropriate affect detection models. In this paper, we present such a novel affect elicitation and detection method for social robots in HRIs. Non-verbal emotional behaviors of the social robot were designed to elicit user affect, which was directly measured through electroencephalography (EEG) signals. HRI experiments with both younger and older adults were conducted to evaluate our affect elicitation technique and compare the two types of affect detection models we developed and trained utilizing multilayer perceptron neural networks (NNs) and support vector machines (SVMs). The results showed that, on average, the self-reported valence and arousal were consistent with the intended elicited affect. Furthermore, the EEG data obtained could be used to train affect detection models, with the NN models achieving higher classification rates.

1. Introduction

A growing number of social robots are being integrated into our daily lives, as they can assist and extend human capabilities in human-centered environments such as homes, hospitals, and workplaces [1]. To effectively assist and interact with humans during human–robot interactions (HRIs), these robots are expected to have social intelligence and engage in bi-directional communication [2]. For robots to communicate with people, they must be able to recognize, interpret, and respond to human affect, which conveys people’s feelings, emotions, thoughts, and intent [3]. Robots that can interpret human affect can promote more effective and engaging HRIs, which can lead to better acceptance by their users [4].
Affective computing enables robots to detect and interpret different modes of human affect using a variety of rule-based or learning techniques [5]. In order to accurately recognize and classify user affect, affect elicitation techniques have been developed for creating and training affect detection models [6].
The majority of existing elicitation techniques often utilize stimuli that users view (e.g., looking at images, watching videos containing certain emotions) [7,8] or engage users in specific human–human scenarios (e.g., social interactions between people) [9] in order to elicit a certain human affect. However, these techniques cannot always elicit the affect that users would experience, specifically, during HRIs, as it has been found that such affect is directly related to the interaction with the robot itself [9,10,11]. For example, users may experience higher affective arousal when interacting with a robot compared with a human for the same activity due to the unfamiliarity and unpredictability of the robot [10]. Therefore, to better interpret and respond to users during HRIs, it is important for affect detection models to be trained and validated with user affect that occurs during HRIs [11].
To date, a handful of affect elicitation techniques have considered the direct use of robots to obtain a user’s affective expressions [11,12,13,14,15,16]. The majority aim to elicit and detect physical affective expressions such as body movements and facial expressions [11,12,13,15,16,17,18,19]. However, these techniques are difficult to use with individuals who have physical impairments. For example, older adults have age-related functional decline in facial expression generation [20], body movements, and postures [21]. As there is a growing interest in developing social robots as assistants for vulnerable populations, a robot’s affect awareness should consider the elicitation and detection of affect with these potential users in order to provide better interaction and assistance [22].
Our research focuses on developing socially assistive robots to facilitate and help people, including older adults and those living with cognitive impairments, with activities of daily living such as meal assistance [23,24], dressing [25,26], exercising [7,27], and cognitive stimulating interventions including playing Bingo and trivia games [28,29,30]. During such assistive interactions, it is important for these robots to not only detect the intent of users but also their affect with respect to the interaction, in order to determine and adapt their own assistive behaviors to each user. In a long-term care environment, where most residents have cognitive impairments, a robot’s capacity for social intelligence becomes important. In this paper, we present the development of a novel autonomous affect elicitation and detection methodology for social robots engaging in HRIs. Our approach explicitly uses the emotional non-verbal behavior of the social robot to elicit user affect. We determine user valence and arousal directly through electroencephalography (EEG) signals. EEG signals are largely involuntarily activated by the central nervous system (CNS) and the autonomic nervous system (ANS) of the human body [31]. They have been successfully used to detect the affective states and cognitive states of people in a number of different age groups [32]. To the authors’ knowledge, we are the first to recognize user affect elicited from interacting with a social robot in a social HRI scenario through the use of EEG signals.

2. Related Work on Affect Elicitation Using Robots

As previously mentioned, a handful of robots have been used for affect elicitation during HRI scenarios [11,12,13,14,15,16]. Such scenarios include a robot playing interactive games with people [11,12,15], sharing the same workspace with people while assisting with daily activities [14,17,18,19], presenting lectures to a crowd [16], and providing companionship [13]. User affect during these interactions has either been determined using expert-coded affect based on verbal and non-verbal communication [11,12,13,17,18,19], self-reported scales [14,15,19], or by using both in parallel [16].

2.1. Coded Affect

In [11], the cat-like robot iCat was used to play chess games on an electronic chessboard with children. The robot displayed a happy facial expression when a player made a bad move and a sad facial expression when the player made a good move. During the interactions, videos of the frontal and lateral views of the children were recorded. The videos were annotated manually by three coders to determine affective postural expressions. These expressions were used to train a recognition model for the level of engagement of a player comparing different learning-based classifiers.
In [12], two child–robot interaction scenarios were designed utilizing a teleoperated NAO robot to elicit and collect spontaneous emotional expressions from the speech, facial expressions, and body language of children. The first scenario involved a child and the robot playing Snakes and Ladders on a computer. Further, 2D cameras and Kinect sensors placed in front of the children were used to record the body, face, and audio information. The robot would display positive or negative body gestures based on the child’s performance. In the second scenario, each child watched movie clips with the robot that elicited different emotions (anger, disgust, fear, happiness, sadness, or surprise). After each movie clip, the robot expressed its own emotion using affective gibberish speech. Its emotion was either the same or contradictory to the movie clip. The children were required to rate the robot’s emotion based on valence and arousal using the Self-Assessment Manikin (SAM) scale [33]. All captured data from both scenarios were annotated manually by four raters using the 2D valence–arousal scale.
In [13], a toy-like robot Mof-mof was used to elicit four different emotional facial expressions (happy, surprised, angry, and sad) by displaying varying robot actions (e.g., hopping, bending) and speech patterns (e.g., “I feel good today”) based on a user’s situation (e.g., how busy they were, their current postures). The user’s facial expressions were captured by a camera and detected using the OKAO Vision facial expression recognition software [34]. Each user entered their current situation on a computer and then the robot would select an action and speech that was expected to elicit a specific facial expression based on a multilayer perceptron neural network robot behavior model. This model was previously trained in [35].

2.2. Self-Reported Affect

In [14], a CRS A460 robot manipulator was used to elicit affect by performing motion trajectories with different velocities and accelerations. Physiological signals including heart rate, perspiration rate, and electromyogram (EMG) were obtained in addition to users’ subjective ratings of their perceived levels of anxiety and calmness on a 5-point Likert scale. The levels of valence and arousal were extracted from the user’s levels of anxiety and calmness and then used as labels for the corresponding physiological data. Three hidden Markov models (HMMs) were trained to detect valence and arousal.
In [15], two Adept Viper 6 degrees-of-freedom (DOF) robot arms with two-finger grippers were used to play the Tower of Hanoi game with users. A Kinect sensor was used to monitor the game by tracking the number of moves made by the users and the robots, and a surveillance camera was used to monitor the interaction in case of an emergency. The players played the game multiple times by themselves, and with a human or a robot collaborator. After each game, they rated their own elicited emotional experience using the Geneva Emotion Wheel (GEW) [36].

2.3. Use of Both Self-Reported and Coded Affect

In [16], a NAO robot was used to present lectures to a crowd of people in order to investigate how robot moods influence the affect of an audience. The robot displayed different arm gestures to convey positive or negative valence during the lectures. Two 30-min lecture scenarios were conducted, each using one of these conditions for the entire lecture. Participants rated their affect using the SAM scale before, in the middle of, and after the lecture. Each scenario was video recorded, and the videos were annotated by two coders to assess participant valence and arousal based on verbal and non-verbal reactions (e.g., laughter, applause) on a 9-point Likert scale. The self-reported results and the coded affect were analyzed separately. They showed agreement on the elicited arousal; however, the coded affect showed higher positive valence in the positive session than the self-reported affect.
The aforementioned robots have been used for affect elicitation during HRI scenarios using different embodiments such as manipulator arms [14,15], animal/creature-like robots [11,13], and humanoid robots [12,16]. It has been found that the manipulator and animal/creature type robots can have limited social embodiment, which may affect their ability to partake in certain social roles and perform a variety of social behaviors [37]. A human-like social embodiment has been shown to make it easier for a robot to follow human social norms, which has resulted in more engaging and effective social HRIs [37]. The majority of the robots engaged in the abovementioned “social” HRI scenarios used only physical affective expressions such as body movements and facial expressions to determine user affect. However, these expressions are not always available in HRI scenarios. For example, exercise facilitation requires users to perform physical movements that may restrain them from displaying body gestures, and a user’s facial expressions can be perturbed due to the increase in effort and muscle fatigue from physical activities [7]. Furthermore, some approaches have only focused on affect elicitation (e.g., [15,16]) without considering the recognition of the user affect that is being elicited.
In this paper, we propose the development of an affect elicitation approach that can elicit user affect with a human-like social robot in order to directly capture the affect that individuals feel during HRI scenarios. In order to consider different populations and HRI activities, we uniquely obtain EEG signals for the detection of user valence and arousal during social HRIs. As EEG measures the electrical activity of the brain, our approach can be used by a robot even when users are participating in physical activities.

3. A User Affect Elicitation Methodology Using a Social Robot

Our affect elicitation methodology utilizes a social robot to directly provide stimuli for eliciting the valence and arousal of a user. We utilize the Pepper robot to display a combination of affective body movements to music in order to induce different user affect. In general, body movements can be used to accurately express distinctive affect [38]. Furthermore, observing affective body movements can activate the mirror neuron network of a person which in turn can produce a similar affect to that observed [39]. Music has also been shown to effectively invoke affect by triggering hormonal and autonomic responses of a person through the direct use of its structural features (e.g., intensity, tempo, and mode) [40]. For example, music that is fast in tempo and composed in the major mode can induce positive valence and high arousal, while music that is written in the minor mode with a slow tempo can induce negative valence and low arousal [40].
As music has a direct impact on brain activation, it is effective in eliciting affect that can be recognized through physiological signals such as EEG [41]. Music has been used as a common elicitation technique for the training of affect detection models [6]. It has been used both independently on its own [42], or combined with other modes such as video [43], or body movements [44]. Both music and gestures/body movements have been validated independently for the use of affect detection [45,46], which motivated their use in our work. Furthermore, they have also been combined together to successfully determine affect [44,47,48,49]. As people often relate particular movement features with music that matches the movements [50], this combination can generate stronger affective responses, including physiological responses, than when using each mode alone [44]. Therefore, combining music with body movements with congruent affective information, as we propose herein for a social robot, has the potential to induce distinct affect in users.
Our proposed affect elicitation and detection methodology is presented in Figure 1. EEG signals are used to measure the affect of users during HRIs, and self-assessments are then used to label the corresponding EEG signals to develop and train an affect detection model for determining the user’s level of valence and arousal. Each of the two sub-systems within our methodology are discussed in detail below.

3.1. Affect Elicitation

We have designed robot body movements which utilize different combinations of upper body, shoulder, head, arm, and hand movements for the robot to express two types of affect: (1) positive valence and high arousal, and (2) negative valence and low arousal. These affect types were chosen as their respective movement dynamics (e.g., speed) and music (e.g., tempo) share a structure that is emotionally relevant to people [44]. In turn, they also can produce stronger physiological responses in the perceivers [44].
Herein, the design of the robot’s body movements is adapted from [45], which identifies distinct associations between certain human body movements and affect. Our designed positive valence and high arousal movements are composed of a series of expansive movements with high movement activities and/or high movement dynamics, Figure 2a. The negative valence and low arousal movements are unexpansive with low movement activity and/or low movement dynamics, Figure 2b. These two affect elicitation stimuli are used for eliciting reciprocal affect in users.
The affective body movements are coordinated with music chosen from a publicly validated dataset that contains 1000 licensed music excerpts from the Free Music Archive (FMA) that were specifically designed for affect elicitation [46]. Each excerpt is 45 s long. A minimum of 10 annotators from 10 different countries were used to rate the level of valence and arousal for each of these excerpts [46]. For the robot stimuli, we selected five music excerpts for each affect type to match with the intended affect of the robot movements accordingly, Table 1. Each excerpt contains only instrumental arrangements, thus excluding those with vocals to prevent the potential influence of a language barrier on the user’s experience. A video of our designed robot affect elicitation stimuli can be found here (https://youtu.be/UaoPb6_uOeE) on our YouTube Channel.
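To make the stimulus design concrete, the following is a minimal sketch of how one affective movement-plus-music stimulus could be scripted on Pepper, assuming the NAOqi Python SDK; the robot address, joint angles, timings, and audio file path are illustrative placeholders rather than the actual gestures and excerpts used in this work.

```python
# Minimal sketch of an affective robot stimulus: an expansive, fast arm gesture
# played together with an up-tempo music excerpt. Assumes the NAOqi Python SDK;
# joint angles, timings, and the audio path are illustrative placeholders only.
from naoqi import ALProxy

PEPPER_IP, PEPPER_PORT = "192.168.1.10", 9559  # hypothetical robot address

motion = ALProxy("ALMotion", PEPPER_IP, PEPPER_PORT)
audio = ALProxy("ALAudioPlayer", PEPPER_IP, PEPPER_PORT)

# Start the music excerpt asynchronously (file path is a placeholder).
audio.post.playFile("/home/nao/music/positive_excerpt.wav")

# Expansive, high-dynamics arm movement (positive valence / high arousal style).
names = ["LShoulderPitch", "RShoulderPitch", "LElbowRoll", "RElbowRoll"]
angles = [[-0.5, 1.0], [-0.5, 1.0], [-1.0, -0.3], [1.0, 0.3]]  # radians
times = [[0.6, 1.2], [0.6, 1.2], [0.6, 1.2], [0.6, 1.2]]       # seconds
motion.angleInterpolation(names, angles, times, True)          # blocking call
```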

3.2. Affect Detection

Elicited user affect is measured by EEG signals. They are labeled based on self-reported perceived affect in the (a) positive valence and high arousal (PH) and (b) negative valence and low arousal (NL) sessions. The labeled data are used to train the affect detection models in order to detect valence and arousal during HRIs.

3.2.1. Physiological Responses

Physiological signals can be more reliable than physical signals (e.g., facial expressions) as they are not easily controlled by people in order to hide or manipulate their affect [31]. Common physiological modes used in affect detection include: (1) cardio activity, (2) skin conductivity, (3) blood volume pulse, (4) surface electromyography (EMG), (5) EEG, and (6) respiration [51]. Only EEG and EMG can be used for detecting both valence and arousal while the other modes are only used to measure arousal [51]. However, EMG can only be used to measure user affect when there is no muscle contraction or movements [52], which can be impractical for users engaging in HRIs. On the other hand, EEG measures the brain electrical activities which are less affected by such movements. Therefore, EEG signals are used in our work to measure a person’s valence and arousal.
The EEG headband we use is the InteraXon Muse 2016, a low-cost four-channel dry electrode EEG sensor with a sampling rate of 256 Hz [53]. The sensor measures the electrical signals from the four electrode locations at TP9 (above the left ear), AF7 (left side of the forehead), AF8 (right side of the forehead), and TP10 (above the right ear), described using the International 10–20 system [53], as shown in Figure 3.
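As an illustration of how the raw signals can be acquired, the sketch below pulls 1 s windows of the four Muse channels from a Lab Streaming Layer (LSL) stream, such as one started with the Muse LSL package [56]; the stream resolution settings and buffering details are assumptions.

```python
# Minimal sketch: acquire raw 4-channel Muse EEG (TP9, AF7, AF8, TP10) from an
# LSL stream, e.g., one started with the Muse LSL package [56]. The 256 Hz rate
# follows the headband specification; error handling is omitted.
import numpy as np
from pylsl import StreamInlet, resolve_byprop

FS = 256                                 # Muse 2016 sampling rate (Hz)
CHANNELS = ["TP9", "AF7", "AF8", "TP10"]

streams = resolve_byprop("type", "EEG", timeout=10)   # find the EEG stream
inlet = StreamInlet(streams[0], max_chunklen=12)

def read_window(seconds=1.0):
    """Collect `seconds` of EEG into an array of shape (n_samples, 4)."""
    samples = []
    while len(samples) < int(seconds * FS):
        chunk, _ = inlet.pull_chunk(timeout=1.0)
        for s in chunk:
            samples.append(s[:len(CHANNELS)])   # Muse also streams an AUX channel
    return np.asarray(samples[: int(seconds * FS)])

window = read_window(1.0)   # one 1 s window, as used for the PSD features below
```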

EEG Feature Extraction

Two types of EEG frequency domain features, namely the power spectral density (PSD) feature and the frontal asymmetry feature, are extracted from the EEG data. Both types of features can be used in real-time valence and arousal detection [54,55]. The EEG data are processed using the Muse LSL package [56].
To extract the PSD, the EEG signal is decomposed using the fast Fourier transform (FFT) over a 1 s sliding window with an overlap of 80%; the windowing is used to reduce spectral leakage and minimize data loss [54].
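A minimal sketch of this windowed PSD and band-power extraction, assuming NumPy/SciPy and 256 Hz windows such as those acquired above, is given below; Welch's method is used here as one concrete FFT-based PSD estimate, and the exact windowing and averaging choices of the study are not reproduced.

```python
# Sketch: PSD band powers per electrode from 1 s sliding windows with 80% overlap.
# Uses Welch's method from SciPy as one concrete choice of FFT-based PSD estimate.
import numpy as np
from scipy.signal import welch

FS = 256
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 40)}

def band_powers(window, fs=FS):
    """window: (n_samples, 4) EEG -> dict of 4 band powers per electrode (16 values)."""
    freqs, psd = welch(window, fs=fs, nperseg=window.shape[0], axis=0)
    feats = {}
    for name, (lo, hi) in BANDS.items():
        idx = (freqs >= lo) & (freqs < hi)
        # integrate the PSD over the band for each of the 4 electrodes
        feats[name] = np.trapz(psd[idx, :], freqs[idx], axis=0)
    return feats

def sliding_windows(eeg, fs=FS, win_s=1.0, overlap=0.8):
    """Yield 1 s windows with 80% overlap from a continuous recording (n_samples, 4)."""
    size = int(win_s * fs)
    step = int(size * (1.0 - overlap))
    for start in range(0, eeg.shape[0] - size + 1, step):
        yield eeg[start:start + size]
```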
Examples of the EEG raw signal in the time domain for both the PH and NL sessions are presented in Figure 4a,b. Examples of five consecutive sliding windows of the PSD in the frequency domain for both sessions are presented in Figure 4c,d. The PSD features are acquired from each electrode location (TP9, AF7, AF8, and TP10) in four distinct frequency bands: θ (4–8 Hz), α (8–13 Hz), β (13–30 Hz), and γ (30–40 Hz) [55]. These features have been commonly used as the input for classifying affective valence and arousal using EEG signals [31,32,55,57]. The θ band power is often correlated with relaxation [54].
An increase in the frontal θ power (e.g., in AF7 and AF8 in Figure 3 and Figure 5a) can be observed with a lower arousal stimulus [54]. In addition, a greater θ band power in the right hemisphere (e.g., in AF8 and TP10) can be observed when there are negative stimuli [58]. The α band power is related to the relaxed state of the mind [54]. An increase in the α band power in the right hemisphere (e.g., in AF8 and TP10) occurs also when viewing negative stimuli [54]. A decrease in the frontal α band power (e.g., in AF7 and AF8) can be observed when someone is exposed to high-arousal stimuli [59].
The β band is associated with the sensory-motor system and an increase in the β band power has been found when someone is exposed to positive stimuli [57]. Furthermore, an increase in the frontal β band power (e.g., in AF7 and AF8) has been observed for viewing high-arousal stimuli [60]. The γ band power has been associated with the integration of information and an increase in the γ band power has been found for viewing positive valence stimuli as well as high-arousal stimuli [57]. An example of these PSD features for the PH session and NL session based on the signals shown in Figure 4 are presented in Figure 5a.
Previous studies have shown that the valence and arousal of a person are correlated to the frontal EEG asymmetry, which refers to the power difference between the left and right frontal hemispheres of the brain within the α and β frequency bands [54,57,61]. More specifically, a greater frontal left hemisphere activity is associated with positive valence, while a greater frontal right hemisphere activity is associated with negative valence [61]. In addition, higher arousal is characterized by a higher β activity and lower α activity on the frontal hemispheres of the brain [61]. Frontal EEG asymmetry features are measured by the ratio of the α and β bands in order to determine valence and arousal [55]. Four different valence and arousal frontal asymmetry features are adapted from [55]. These features are computed as valence, v1 to v4, Equations (1)–(4), and arousal, a1 to a4, Equations (5)–(8):

v1 = αAF8/βAF8 − αAF7/βAF7    (1)
v2 = ln(αAF7) − ln(αAF8)    (2)
v3 = βAF7/αAF7 − βAF8/αAF8    (3)
v4 = αAF8/αAF7    (4)
a1 = (αAF7 + αAF8)/(βAF7 + βAF8)    (5)
a2 = −(ln(αAF7) + ln(αAF8))    (6)
a3 = log2((βAF7 + βAF8)/(αAF7 + αAF8))    (7)
a4 = (βAF7 + βAF8)/(αAF7 + αAF8)    (8)

where αAF7, αAF8, βAF7, and βAF8 are the α and β band powers measured at the AF7 and AF8 electrode locations. An example of these features, for the PH and NL sessions, based on the signals shown in Figure 4, is presented in Figure 5b.
Based on the results presented in Figure 5a, the NL session had a higher θ band power which indicates that the NL session induced more negative valence and lower arousal compared with the PH session [54,58]. The higher α band power in the NL session indicates the NL session elicited more negative valence than the PH session [54]. The lower α band power indicates the PH session induced higher arousal than the NL session [59]. The PH session had a higher β and γ band power which resulted in more positive valence and higher arousal compared with the NL session [57].
Regarding the average frontal EEG asymmetry based on the α and β band powers presented in Figure 5b, the PH session had higher v2 and v3 and lower v1 and v4 compared with the NL session. This indicates that the user in the PH session experienced more positive valence compared with the NL session [55,61]. On the other hand, the PH session had higher a2–a4 and lower a1 compared with the NL session. This indicates that the user experienced higher arousal in the PH session [55,61].
In total, 20 features are utilized for each of valence and arousal detection: the 16 PSD features for the four frequency bands, θ, α, β, and γ, measured at the locations TP9, AF7, AF8, and TP10, and the four corresponding frontal EEG asymmetry features obtained from Equations (1)–(4) for valence or Equations (5)–(8) for arousal.
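The sketch below shows how these 20-dimensional valence and arousal feature vectors could be assembled from the band powers computed earlier; the electrode ordering and function names are assumptions for illustration.

```python
# Sketch: assemble the 20-dimensional valence and arousal feature vectors from the
# 16 PSD band powers and the frontal asymmetry features of Equations (1)-(8).
# Electrode order is assumed to be [TP9, AF7, AF8, TP10]; names are illustrative.
import numpy as np

def asymmetry_features(feats):
    a_AF7, a_AF8 = feats["alpha"][1], feats["alpha"][2]
    b_AF7, b_AF8 = feats["beta"][1], feats["beta"][2]
    valence = [
        a_AF8 / b_AF8 - a_AF7 / b_AF7,               # v1, Eq. (1)
        np.log(a_AF7) - np.log(a_AF8),               # v2, Eq. (2)
        b_AF7 / a_AF7 - b_AF8 / a_AF8,               # v3, Eq. (3)
        a_AF8 / a_AF7,                               # v4, Eq. (4)
    ]
    arousal = [
        (a_AF7 + a_AF8) / (b_AF7 + b_AF8),           # a1, Eq. (5)
        -(np.log(a_AF7) + np.log(a_AF8)),            # a2, Eq. (6)
        np.log2((b_AF7 + b_AF8) / (a_AF7 + a_AF8)),  # a3, Eq. (7)
        (b_AF7 + b_AF8) / (a_AF7 + a_AF8),           # a4, Eq. (8)
    ]
    return np.array(valence), np.array(arousal)

def feature_vectors(feats):
    """Return the 20-D valence and arousal feature vectors for one 1 s window."""
    psd16 = np.concatenate([feats[b] for b in ("theta", "alpha", "beta", "gamma")])
    v4, a4 = asymmetry_features(feats)
    return np.concatenate([psd16, v4]), np.concatenate([psd16, a4])
```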

3.2.2. Self-Assessment

For training, each user completed a self-assessment questionnaire to report their perceived affect after viewing each of the stimuli. We utilize the Self-Assessment Manikin (SAM) [33] pictorial assessment technique to measure self-reported valence and arousal to the robot stimuli. For both valence and arousal, a 5-point Likert scale ranging from −2 (highly negative valence or very low arousal) to +2 (highly positive valence or very high arousal) with the corresponding SAM pictorial representation is presented to each user. The self-assessed affect is then used to label the corresponding EEG signals to develop our affect detection models.
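A minimal sketch of this labeling step is shown below, mapping a session's SAM rating to a binary class for every EEG feature window from that session; the thresholds follow the scale described above, and the function name is illustrative.

```python
# Sketch: label each EEG feature window with the participant's self-reported SAM
# rating for the session. Ratings > 0 map to positive valence / high arousal and
# ratings < 0 to negative valence / low arousal; neutral (0) sessions are discarded,
# mirroring the exclusion of reports that did not match the intended stimulus.
def label_windows(windows, sam_rating):
    """windows: list of feature vectors from one session; sam_rating: -2..+2."""
    if sam_rating == 0:
        return []                           # neutral report: not used for training
    label = 1 if sam_rating > 0 else 0      # 1 = positive/high, 0 = negative/low
    return [(w, label) for w in windows]
```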

3.2.3. Affect Detection Model

We investigate two learning-based models for our affect detection module. Namely, a three-hidden layer multilayer perceptron neural network (NN) model and a support vector machine (SVM) model with the radial basis function kernel (RBF) are considered and compared using the Scikit-Learn toolbox [62]. These two models are considered as they are the most commonly used learning-based models for affect classification [32]. Each model consists of a valence sub-model and arousal sub-model. As previously mentioned, they are trained using the labeled EEG signals.
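An illustrative scikit-learn [62] configuration of the two models is sketched below; the hidden-layer sizes and other hyperparameters are assumptions, as the study's specific settings are not restated here.

```python
# Sketch of the two learning-based detection models compared in this work, using
# scikit-learn [62]: a three-hidden-layer multilayer perceptron and an RBF-kernel
# SVM. Layer sizes and hyperparameters are assumptions, not the study's settings.
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def build_models():
    nn = make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(64, 32, 16),  # three hidden layers (assumed sizes)
                      max_iter=2000, random_state=0),
    )
    svm = make_pipeline(
        StandardScaler(),
        SVC(kernel="rbf", probability=True, random_state=0),  # RBF-kernel SVM
    )
    return {"NN": nn, "SVM": svm}

# One model pair is trained per affect dimension (valence and arousal sub-models).
```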

4. Experiments

A user study was conducted to evaluate the proposed affect elicitation and detection methodology. We recruited nineteen participants (age: µ = 45.58 years, σ = 30.95) for two one-on-one interactions with Pepper. This sample size is comparable to other affect elicitation and detection studies, which have had 4–22 participants, e.g., [11,12,13]. Participants consisted of individuals from two different age groups: (1) 13 younger adults (YA) between the ages of 22 and 38, mainly university students (12 male and 1 female), and (2) 6 older adults (OA) between 81 and 96 years old from a local long-term care facility (1 male and 5 female). All subjects gave their informed consent for inclusion before they participated in the study. The study received approval from the University of Toronto Ethics Committee.
Our experiment follows the standard within-subject design of comparing different emotions in the same context, where each participant is exposed to two different stimuli under the same experimental conditions [63]. This experiment design has been commonly used in numerous affect elicitation studies, where EEG data are collected during the applied stimuli and used to develop affect detection models [64,65,66]. The experiment took place in an isolated quiet room. Participants were seated in front of the robot and wore the EEG headband, Figure 6. Both stimuli, consisting of the robot movements set to music, were presented to each participant to elicit either positive valence and high arousal (PH session) or negative valence and low arousal (NL session). The stimuli were presented to the users in a random order. Prior to each session, participants were given 2 min to relax in silence, during which no stimulus was presented. Each session was approximately 4 min in duration and was followed by a 5 min break. Participants were asked to report their perceived valence and arousal levels using the 5-point (+2 to −2) SAM scale during the break, where negative ratings were considered as negative valence or low arousal, 0 was considered as neutral, and positive ratings were considered as positive valence or high arousal.

4.1. Affect Elicitation Results

The self-reported results for all the participants as well as for each age group per each session are presented in both Figure 7 and Table 2. On average, the participants reported positive valence and high arousal for the PH session and negative valence and low arousal for the NL session, Figure 7a,b, which was consistent with our intended elicited affect for each session.
With respect to the two age groups, for valence, 84.62% of YA self-reported positive valence for the PH session as well as negative valence for the NL session. For OA, all of the participants self-rated a +2 in the PH session. However, there was less consensus for this group in the NL session, where 50% of them reported negative valence, 33.33% reported positive, and 16.67% reported neutral valence. The data from these 84.62% of the YA and 50% of the OA (i.e., 14 participants in total) were used to train our valence detection models. One YA and two OA perceived positive valence for both sessions as they stated that they were intrigued/amazed by the robot’s performance. Compared with the YA, the OA, in general, had higher perceived valence in both sessions. This may be due to the fact that they had less exposure to and experience with robots, and these individuals were more excited to interact with a robot for the first time.
With respect to arousal, on average, both the YA (61.54%) and OA (66.67%) self-reported high arousal for the PH session, Figure 7c. The YA also self-reported low arousal (76.92%) during the NL session, while all of the OA reported neutral arousal, Figure 7d. The data from these 61.54% YA and 66.67% OA (i.e., 12 participants in total), who reported higher arousal in the PH session than the NL session, were used to develop our arousal detection model. The same YA, who perceived positive valence for both sessions, also reported high arousal for both sessions. In addition, three YA rated low arousal for both sessions; however, they all commented on the robot’s ability to display complex human-like movements. Two OA perceived neutral arousal for both the PH and NL sessions. The first OA had positive valence for both sessions. This participant found the robot’s movements in both sessions to be pleasant. The second OA had positive valence for the PH session, and neutral valence for the NL session. This participant stated that they recognized the robot was sad, which resulted in the lower reported valence; however, the difference in body movements and music did not affect their arousal.
Regarding the OA who reported neutral valence during the NL session, we postulate that this could be a result of older adults having a negative-to-neutral shift in perceiving negative stimuli, namely, they tend to experience negative stimuli as neutral more frequently than younger adults [67]. This can be a result of age-related reduced amygdala activity to negative stimuli, especially negative valence and low-arousal stimuli, where the amygdala is the region of the brain associated with processing and experiencing emotions [67]. In addition, the reason why the OA did not self-report lower than neutral arousal may also be due to age-related changes in the intensity of the affective stimuli to which the amygdala is more receptive [67]. In general, when high-arousing stimuli are observed, OA and YA show similar levels of amygdala activity; however, OA have decreased amygdala activity compared with YA when low-arousing stimuli are observed, such that they often do not experience the low arousal that YA may experience [67].

4.2. Affect Detection Models

As we only used the data of users whose perceived affect matched the intended elicited affect in both sessions, the EEG data obtained from 14 participants were used for developing our valence detection sub-model and from 12 participants for the arousal detection sub-model. Each participant had two sessions (i.e., one positive and one negative session), and the EEG data were recorded for an average of 223 s per session. Average band powers were sampled every second to compute the features, which resulted in 6254 samples of features from the 14 participants for valence detection and 5424 samples from the 12 participants for arousal detection. For both the NN and SVM models, we labeled our samples into two classes for valence (positive and negative valence) and for arousal (high and low arousal) such that the same number of samples was used for each class (i.e., 3127 samples for both the positive and negative valence classes, and 2712 samples for both the high and low arousal classes).
We first separated our data into a training set and a testing set (an approximately 75%/25% split between users), with the testing set containing the data of users that are not in the training set. Both the training and testing sets had a combination of YA and OA data. We conducted both subject-dependent ten-fold and subject-independent leave-one-out (LOO) cross-validations on the training set. A further subject-independent evaluation was then performed on the testing set, consisting of multiple new users, to assess how well the classification models perform on subjects unknown to the system.
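The two validation schemes can be implemented along the following lines with scikit-learn, assuming per-window features X, binary labels y, and an array of participant IDs groups aligned with the windows; this is a sketch of the evaluation protocol, not the study's exact code.

```python
# Sketch: subject-dependent ten-fold CV versus subject-independent leave-one-
# subject-out CV, assuming per-window features X, binary labels y, and an array
# of participant IDs `groups` aligned with the windows (names are illustrative).
from sklearn.model_selection import KFold, LeaveOneGroupOut, cross_val_score

def evaluate(model, X, y, groups):
    # Subject-dependent: windows from the same participant may appear in both
    # the training and validation folds.
    tenfold = cross_val_score(model, X, y,
                              cv=KFold(n_splits=10, shuffle=True, random_state=0))
    # Subject-independent: each fold holds out every window of one participant.
    loo = cross_val_score(model, X, y, groups=groups, cv=LeaveOneGroupOut())
    return tenfold.mean(), loo.mean()
```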
The overall classification results are presented in Table 3. For the ten-fold cross-validation, the classification rates for detecting valence were 71.9% and 70.1%, and for arousal were 70.6% and 69.5% using the NN and SVM detection models, respectively. The LOO cross-validation classification rates for valence were 63.7% and 61.8% for the NN and SVM models and for arousal were 63.3% and 61.6% for NN and SVM. Since the ten-fold cross-validation is subject-dependent while the LOO cross-validation is subject-independent, the classification rates from the ten-fold are expected to be higher [31,68]. With respect to the testing set, the classification rates were 63.3% and 62.4% for valence and 62.6% and 61.2% for arousal for the NN and SVM models. The classification results from both the LOO cross-validation and subject-independent testing are comparable to each other. Furthermore, they are also comparable to other non-HRI affect detection techniques used with subject-independent EEG data, namely 57.6–62.5% for valence and 55.7–62.5% for arousal [43,66,68,69,70,71,72]. Based on the results, the NN achieved higher classification rates for detecting both valence and arousal with both the training and testing sets.
To further evaluate the ability of each learning-based model to distinguish between classes, receiver operating characteristic (ROC) curves [73] were plotted for each model on the testing set, Figure 8. The area under the curve (AUC) of an ROC curve represents how well the model is able to distinguish between different classes [73]. For the NN model, the AUCs were 0.73 and 0.74 for detecting valence and arousal, respectively. For the SVM model, the AUCs were 0.72 and 0.71 for valence and arousal detection. Therefore, the NN model achieved a slightly higher AUC than the SVM model for both valence and arousal detection, and was able to effectively distinguish between positive and negative valence as well as high and low arousal.
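The ROC analysis on the testing set can be reproduced along these lines, assuming a trained model that exposes class probabilities (as the pipelines sketched earlier do); plotting details are illustrative.

```python
# Sketch: ROC curve and AUC on the held-out testing set for one trained model,
# assuming it exposes predict_proba (as the pipelines sketched earlier do).
import matplotlib.pyplot as plt
from sklearn.metrics import auc, roc_curve

def plot_roc(model, X_test, y_test, label):
    scores = model.predict_proba(X_test)[:, 1]   # probability of the positive class
    fpr, tpr, _ = roc_curve(y_test, scores)
    roc_auc = auc(fpr, tpr)
    plt.plot(fpr, tpr, label=f"{label} (AUC = {roc_auc:.2f})")
    plt.plot([0, 1], [0, 1], linestyle="--", color="grey")   # chance level
    plt.xlabel("False positive rate")
    plt.ylabel("True positive rate")
    plt.legend()
    return roc_auc
```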

5. Conclusions

In this paper, we present the development of a novel affect elicitation and detection methodology for socially assistive robots. The affect elicitation stimuli were uniquely designed using non-verbal emotional behaviors of the robot set to affective music to elicit positive valence and high arousal, and negative valence and low arousal. User affect was measured through EEG signals and used to train two learning-based affect detection models. HRI experiments with two different age groups showed that the majority of participants were able to successfully perceive and reciprocate the affect as intended. Furthermore, a three-hidden-layer multilayer perceptron neural network model achieved better classification results for both valence and arousal detection than a support vector machine model. Our future work will focus on using our affect detection model during various assistive HRI tasks to detect and respond to user affect in order to promote more engaging HRIs.

Author Contributions

Individual contributions from the authors of this research paper are as follows: conceptualization, M.S. (Mingyang Shao), and G.N.; methodology, M.S. (Mingyang Shao), and G.N.; software, M.S. (Mingyang Shao); validation, M.S. (Mingyang Shao); formal analysis, M.S. (Mingyang Shao); investigation, M.S. (Mingyang Shao); resources, M.S. (Mingyang Shao), M.S. (Matt Snyder), G.N., and B.B.; data curation, M.S. (Mingyang Shao); writing—original draft preparation, M.S. (Mingyang Shao); writing—review and editing, M.S. (Mingyang Shao), M.S. (Matt Snyder), G.N., and B.B.; visualization, M.S. (Mingyang Shao); supervision, M.S. (Matt Snyder), G.N. and B.B.; project administration, M.S. (Mingyang Shao), M.S. (Matt Snyder), G.N., and B.B.; funding acquisition, G.N. and B.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research is supported by AGE-WELL Inc., the Natural Sciences and Engineering Council of Canada (NSERC), the Canadian Institute for Advanced Research (CIFAR), and the Canada Research Chairs Program.

Acknowledgments

The authors would like to thank our partner long-term care facility, the Yee Hong Centre for Geriatric Care in Mississauga, and our experiment participants. We would also like to thank Michael Pham-Hung and Silas Franco Dos Reis Alves for their assistance with this work.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Fortunati, L.; Esposito, A.; Lugano, G. Introduction to the Special Issue “Beyond Industrial Robotics: Social Robots Entering Public and Domestic Spheres”. Inf. Soc. 2015, 31, 229–236. [Google Scholar] [CrossRef]
  2. Saunderson, S.; Nejat, G. How Robots Influence Humans: A Survey of Nonverbal Communication in Social Human-Robot Interaction. Int. J. Soc. Robot. 2019, 11, 1–34. [Google Scholar] [CrossRef]
  3. McColl, D.; Hong, A.; Hatakeyama, N.; Nejat, G.; Benhabib, B. A Survey of Autonomous Human Affect Detection Methods for Social Robots Engaged in Natural HRI. J. Intell. Robot. Syst. 2016, 82, 101–133. [Google Scholar] [CrossRef]
  4. Ficocelli, M.; Terao, J.; Nejat, G. Promoting Interactions Between Humans and Robots Using Robotic Emotional Behavior. IEEE Trans. Cybern. 2016, 46, 2911–2923. [Google Scholar] [CrossRef] [PubMed]
  5. Cambria, E. Affective Computing and Sentiment Analysis. IEEE Intell. Syst. 2016, 31, 102–107. [Google Scholar] [CrossRef]
  6. Kory, J.; Mello, S.D. Affect Elicitation for Affective Computing. In Oxford Handbook of Affective Computing; Calvo, R., D’Mello, S., Gratch, J., Kappas, A., Eds.; Oxford University Press: New York, NY, USA, 2014; pp. 1–22. [Google Scholar]
  7. Shao, M.; Franco, S.F.D.R.; Ismail, O.; Zhang, X.; Nejat, G.; Benhabib, B. You are doing great! Only one Rep left: An affect-aware social robot for exercising. In Proceedings of the 2019 IEEE International Conference on Systems, Man and Cybernetics, Bari, Italy, 6–9 October 2019; pp. 3791–3797. [Google Scholar]
  8. Schaaff, K.; Schultz, T. Towards an EEG-based emotion recognizer for humanoid robots. In Proceedings of the 18th IEEE International Symposium on Robot and Human Interactive Communication, Toyama, Japan, 27 September–2 October 2009; pp. 792–796. [Google Scholar]
  9. Castellano, G.; Leite, I.; Pereira, A.; Martinho, C.; Paiva, A.; Mcowan, P.W. Affect Recognition for Interactive Companions: Challenges and Design in Real World Scenarios. J. Multimodal User Interfaces 2009, 3, 89–98. [Google Scholar] [CrossRef]
  10. Riether, N.; Hegel, F.; Wrede, B.; Horstmann, G. Social facilitation with social robots? In Proceedings of the 2012 7th ACM/IEEE International Conference on Human-Robot Interaction, Boston, MA, USA, 5–8 March 2012; pp. 41–47. [Google Scholar]
  11. Sanghvi, J.; Castellano, G.; Leite, I.; Pereira, A.; Mcowan, P.W.; Paiva, A. Automatic analysis of affective postures and body motion to detect engagement with a game companion categories and subject descriptors. In Proceedings of the 6th International Conference on Human Robot Interaction, Lausanne, Switzerland, 6–9 March 2011; pp. 305–312. [Google Scholar]
  12. Wang, W.; Athanasopoulos, G.; Yilmazyildiz, S.; Patsis, G.; Enescu, V.; Sahli, H.; Verhelst, W.; Hiolle, A.; Lewis, M.; Cañamero, L. Natural emotion elicitation for emotion modeling in child-robot interactions. In Proceedings of the 4th Workshop on Child Computer Interaction, Singapore, 19 September 2014; pp. 51–56. [Google Scholar]
  13. Kumagai, K.; Hayashi, K.; Mizuuchi, I. Elicitation of specific facial expression by robot’s action. In Proceedings of the International Conference on Advanced Mechatronics, Tokyo, Japan, 5–8 December 2015; pp. 53–54. [Google Scholar]
  14. Kulic, D.; Croft, E.A. Affective State Estimation for Human-Robot Interaction. IEEE Trans. Robot. 2007, 23, 991–1000. [Google Scholar] [CrossRef]
  15. Jercic, P.; Wen, W.; Hagelbäck, J.; Sundstedt, V. The Effect of Emotions and Social Behavior on Performance in a Collaborative Serious Game Between Humans and Autonomous Robots. Int. J. Soc. Robot. 2018, 10, 115–129. [Google Scholar] [CrossRef] [Green Version]
  16. Xu, J.; Broekens, J.; Hindriks, K.V.; Neerincx, M. Effects of bodily mood expression of a robotic teacher on students. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Chicago, IL, USA, 14–18 September 2014; pp. 2614–2620. [Google Scholar]
  17. McColl, D.; Nejat, G. Determining the Affective Body Language of Older Adults during Socially Assistive HRI. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Chicago, IL, USA, 14–18 September 2014; pp. 2633–2638. [Google Scholar]
  18. McColl, D.; Nejat, G. Affect Detection from Body Language during Social HRI. In Proceedings of the IEEE International Symposium on Robot and Human Interactive Communication, Paris, France, 9–13 September 2012; pp. 1013–1018. [Google Scholar]
  19. McColl, D.; Jiang, G.; Nejat, G. Classifying a Person’s Degree of Accessibility from Natural Body Language During Social Human-Robot Interactions. IEEE Trans. Cybern. 2017, 47, 524–538. [Google Scholar] [CrossRef]
  20. Metallinou, A.; Yang, Z.; Lee, C.C.; Busso, C.; Carnicke, S.; Narayanan, S.S. The USC CreativeIT Database of Multimodal Dyadic Interactions: From Speech and Full Body Motion Capture to Continuous Emotional Annotations. J. Lang. Resour. Eval. 2016, 50, 497–521. [Google Scholar] [CrossRef]
  21. Diehr, P.H.; Thielke, S.M.; Newman, A.B.; Hirsch, C.; Tracy, R. Decline in Health for Older Adults: Five-Year Change in 13 Key Measures of Standardized Health. J. Gerontol. 2013, 68, 1059–1067. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  22. Wilson, J.R.; Scheutz, M.; Briggs, G. Reflections on the Design Challenges Prompted by Affect-Aware Socially Assistive Robots. In Emotions and Personality in Personalized Services; Tkalčič, M., De Carolis, B., de Gemmis, M., Odić, A., Košir, A., Eds.; Springer: Berlin/Heidelberg, Germany, 2016; pp. 377–395. [Google Scholar]
  23. McColl, D.; Nejat, G. Meal-Time With a Socially Assistive Robot and Older Adults at a Long-term Care Facility. J. Hum.-Robot Interact. 2013, 2, 152–171. [Google Scholar] [CrossRef] [Green Version]
  24. McColl, D.; Nejat, G. A Socially Assistive Robot That Can Monitor Affect of the Elderly During Meal-Time Assistance. J. Med. Devices 2014, 8, 030941. [Google Scholar] [CrossRef]
  25. Woiceshyn, L.; Wang, Y.; Nejat, G.; Benhabib, B. Personalized clothing recommendation by a social robot. In Proceedings of the IEEE 5th International Symposium on Robotics and Intelligent Sensors, Ottawa, ON, Canada, 5–7 October 2017; pp. 179–185. [Google Scholar]
  26. Woiceshyn, L.; Wang, Y.; Nejat, G.; Benhabib, B. A Socially assistive robot to help with getting dressed. In Proceedings of the 2017 Design of Medical Devices Conference, Minneapolis, MN, USA, 10–13 April 2017. [Google Scholar]
  27. Hong, A.; Lunscher, N.; Hu, T.; Tsuboi, Y.; Zhang, X.; Alves, S.F.R.; Nejat, G.; Benhabib, B. A Multimodal Emotional Human-Robot Interaction Architecture for Social Robots Engaged in Bidirectional Communication. IEEE Trans. Cybern. 2020, 1–14. [Google Scholar] [CrossRef] [PubMed]
  28. Louie, W.G.; Li, J.; Mohamed, C.; Despond, F.; Lee, V.; Nejat, G. Tangy the Robot Bingo Facilitator: A Performance Review. J. Med. Devices 2015, 9, 020936. [Google Scholar] [CrossRef]
  29. Louie, W.G.; Li, J.; Vaquero, T.; Nejat, G. A Focus group study on the design considerations and impressions of a socially assistive robot for long-term care. In Proceedings of the IEEE International Symposium on Robot and Human Interactive Communication, Edinburgh, UK, 25–29 August 2014; pp. 237–242. [Google Scholar]
  30. Louie, W.G.; Nejat, G. A Social Robot Learning to Facilitate an Assistive Group-Based Activity from Non-expert Caregivers. Int. J. Soc. Robot. 2020, 1–18. [Google Scholar] [CrossRef]
  31. Shu, L.; Xie, J.; Yang, M.; Li, Z.; Li, Z.; Liao, D.; Xu, X.; Yang, X. A Review of Emotion Recognition Using Physiological Signals. Sensors 2018, 18, 2074. [Google Scholar] [CrossRef] [Green Version]
  32. Al-Nafjan, A.; Hosny, M.; Al-Ohali, Y.; Al-Wabil, A. Review and Classification of Emotion Recognition Based on EEG Brain-Computer Interface System Research: A Systematic Review. Appl. Sci. 2017, 7, 1239. [Google Scholar] [CrossRef] [Green Version]
  33. Bradley, M.; Lang, P.J. Measuring Emotion: The Self-Assessment Manikin and the Semantic Differential. J. Behav. Ther. Exp. Psychiatry 1994, 25, 49–59. [Google Scholar] [CrossRef]
  34. Lao, S.; Kawade, M. Vision-based face understanding technologies and their applications. In Proceedings of the Chinese Conference on Advances in Biometric Person Authentication, Guangzhou, China, 13–15 December 2004; pp. 339–348. [Google Scholar]
  35. Kumagai, K.; Baek, J.; Mizuuchi, I. A situation-aware action selection based on individual’s preference using emotion estimation evaluator. In Proceedings of the IEEE International Conference on Robotics and Biomimetics, Bali, Indonesia, 5–10 December 2014; pp. 356–361. [Google Scholar]
  36. Scherer, K.R. What Are Emotions? And How Can They Be Measured? Soc. Sci. Inf. 2005, 44, 695–729. [Google Scholar] [CrossRef]
  37. Deng, E.; Mutlu, B.; Matarić, M.J. Embodiment in Socially Interactive Robots. Found. Trends Robot. 2019, 7, 251–356. [Google Scholar] [CrossRef]
  38. Aviezer, H.; Trope, Y.; Todorov, A. Body Cues, Not Facial Expressions, Discriminate Between Intense Positive and Negative Emotions. Science 2012, 338, 1225–1229. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  39. Shafir, T.; Taylora, S.F.; Atkinsonc, A.P.; Langeneckerd, S.A.; Zubietaa, J.-K. Emotion Regulation Through Execution, Observation, and Imagery of Emotional Movements. Brain Cogn. 2014, 82, 219–227. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  40. Ribeiro, F.S.; Santos, F.H.; Albuquerque, P.B.; Oliveira-Silva, P. Emotional Induction Through Music: Measuring Cardiac and Electrodermal Responses of Emotional States and Their Persistence. Front. Psychol. 2019, 10, 451–463. [Google Scholar] [CrossRef] [Green Version]
  41. Koelsch, S. Towards a Neural Basis of Music-Evoked Emotions. Trends Cogn. Sci. 2010, 14, 131–137. [Google Scholar] [CrossRef]
  42. Lin, Y.-P.; Wang, C.-H.; Jung, T.-P.; Wu, T.-L.; Jeng, S.-K.; Duann, J.-R.; Chen, J.-H. EEG-Based Emotion Recognition in Music Listening. IEEE Trans. Biomed. Eng. 2010, 57, 1798–1806. [Google Scholar]
  43. Koelstra, S.; Muhl, C.; Soleymani, M.; Lee, J.; Yazdani, A.; Ebrahimi, T.; Pun, T.; Nijholt, A.; Patras, I. DEAP: A Database for Emotion Analysis Using Physiological Signals. IEEE Trans. Affect. Comput. 2012, 3, 18–31. [Google Scholar] [CrossRef] [Green Version]
  44. Christensen, J.F.; Gaigg, S.B.; Gomila, A.; Oke, P.; Calvo-merino, B. Enhancing Emotional Experiences to Dance Through Music: The Role of Valence and Arousal in the Cross-Modal Bias. Front. Hum. Neurosci. 2014, 8, 757–765. [Google Scholar] [CrossRef] [Green Version]
  45. Wallbott, H.G. Bodily Expression of Emotion. Eur. J. Soc. Psychol. 1998, 28, 879–896. [Google Scholar] [CrossRef]
  46. Soleymani, M.; Caro, M.N.; Schmidt, E.M.; Sha, C.-Y.; Yang, Y.-H. 1000 Songs for emotional analysis of music. In Proceedings of the 2nd ACM International Workshop on Crowdsourcing for Multimedia, Barcelona, Spain, 22 October 2013; pp. 1–6. [Google Scholar]
  47. Chapados, C.; Levitin, D.J. Cross-modal Interactions in the Experience of Musical Performances: Physiological Correlates. Cognition 2008, 108, 639–651. [Google Scholar] [CrossRef] [Green Version]
  48. Christensen, J.F.; Nadal, M.; Cela-Conde, C.J. A Norming Study and Library of 203 Dance Movements. Perception 2014, 43, 178–206. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  49. Jola, C.; Ehrenberg, S.; Reynolds, D. The Experience of Watching Dance: Phenomenological-Neuroscience Duets. Phenomenol. Cogn. Sci. 2012, 11, 17–37. [Google Scholar] [CrossRef] [Green Version]
  50. Sievers, B.; Polansky, L.; Casey, M.; Wheatley, T. Music and Movement Share a Dynamic Structure That Supports Universal Expressions of Emotion. Proc. Natl. Acad. Sci. USA 2013, 110, 70–75. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  51. Healey, J. Physiological sensing of emotion. In Oxford Handbook of Affective Computing; Calvo, R., D’Mello, S., Gratch, J., Kappas, A., Eds.; Oxford University Press: New York, NY, USA, 2014; pp. 1–20. [Google Scholar]
  52. Girardi, D.; Lanubile, F.; Novielli, N. Emotion Detection using noninvasive low cost sensors. In Proceedings of the International Conference on Affective Computing and Intelligent Interaction, San Antonio, TX, USA, 23–26 October 2017; pp. 125–130. [Google Scholar]
  53. InteraXon Inc. Technical Specifications, Validation, and Research Use; InteraXon Inc.: Toronto, ON, Canada, 2016. [Google Scholar]
  54. Zhao, G.; Zhang, Y.; Ge, Y. Frontal EEG Asymmetry and Middle Line Power Difference in Discrete Emotions. Front. Behav. Neurosci. 2018, 12, 225–239. [Google Scholar] [CrossRef] [Green Version]
  55. Al-Nafjan, A.; Hosny, M.; Al-Wabil, A.; Al-Ohali, Y. Classification of Human Emotions From Electroencephalogram (EEG) Signal Using Deep Neural Network. Int. J. Adv. Comput. Sci. Appl. 2017, 8, 419–425. [Google Scholar] [CrossRef]
  56. Barachant, A.; Morrison, D.; Banville, H.; Kowaleski, J.; Shaked, U.; Chevallier, S.; Tresols, J.J.T. Muse-lsl. Available online: https://github.com/alexandrebarachant/muse-lsl (accessed on 31 May 2020).
  57. Mühl, C.; Allison, B.; Nijholt, A.; Chanel, G. A Survey of Affective Brain Computer Interfaces: Principles, State-Of-The-Art, and Challenges. Brain-Comput. Interfaces 2014, 1, 66–84. [Google Scholar]
  58. Aftanas, L.I.; Varlamov, A.A.; Pavlov, S.V.; Makhnev, V.P.; Reva, N.V. Affective Picture Processing: Event-Related Synchronization Within Individually Defined Human Theta Band Is Modulated by Valence Dimension. Neurosci. Lett. 2001, 303, 115–118. [Google Scholar] [CrossRef]
  59. Reuderink, B.; Mühl, C.; Poel, M. Valence, Arousal and Dominance in the EEG During Game Play. Int. J. Auton. Adapt. Commun. Syst. 2013, 6, 45–62. [Google Scholar] [CrossRef]
  60. Menon, S.; Geethanjali, B.; Seshadri, N.P.G.; Muthumeenakshi, S.; Nair, S. Evaluating the induced emotions on physiological response. In Computational Signal Processing and Analysis; Nandi, A.K., Sujatha, N., Menaka, R., Alex, J.S.R., Eds.; Springer: Singapore, 2018; pp. 211–220. [Google Scholar]
  61. Ramirez, R.; Vamvakousis, Z. Detecting emotion from EEG Signals Using the emotive epoc device. In Proceedings of the International Conference on Brain Informatics, Macau, China, 4–7 December 2012; pp. 175–184. [Google Scholar]
  62. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-Learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
  63. Stemmler, G. Methodological considerations in the psychophysiological study of emotion. In Handbook of Affective Sciences; Davidson, R.J., Scherer, K.R., Goldsmith, H.H., Eds.; Oxford University Press: New York, NY, USA, 2003; pp. 225–255. [Google Scholar]
  64. Lan, Z.; Sourina, O.; Wang, L.; Liu, Y. Real-Time EEG-Based Emotion Monitoring Using Stable Features. Vis. Comput. 2016, 32, 347–358. [Google Scholar] [CrossRef]
  65. Zheng, W.L.; Liu, W.; Lu, Y.; Lu, B.L.; Cichocki, A. Emotionmeter: A Multimodal Framework for Recognizing Human Emotions. IEEE Trans. Cybern. 2019, 49, 1110–1122. [Google Scholar] [CrossRef] [PubMed]
  66. Lin, Y.P.; Yang, Y.H.; Jung, T.P. Fusion of Electroencephalographic Dynamics and Musical Contents for Estimating Emotional Responses in Music Listening. Front. Neurosci. 2014, 8, 1–14. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  67. Dolcos, S.; Katsumi, Y.; Dixon, R.A. The Role of Arousal in the Spontaneous Regulation of Emotions in Healthy Aging: A fMRI Investigation. Front. Psychol. 2014, 5, 1–12. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  68. Pandey, P.; Seeja, K.R. Subject-Independent Emotion Detection From EEG Using VMD and Deep Learning. J. King Saud Univ. Comput. Inf. Sci. 2019. [Google Scholar] [CrossRef]
  69. Katsigiannis, S.; Ramzan, N. DREAMER: A Database for Emotion Recognition Through EEG and ECG Signals From Wireless Low-Cost Off-The-Shelf Devices. IEEE J. Biomed. Health Inform. 2018, 22, 98–107. [Google Scholar] [CrossRef] [Green Version]
  70. Li, X.; Song, D.; Zhang, P.; Zhang, Y.; Hou, Y.; Hu, B. Exploring EEG Features in Cross-Subject Emotion Recognition. Front. Neurosci. 2018, 12, 1–15. [Google Scholar] [CrossRef] [Green Version]
  71. Soleymani, S.; Soleymani, M. Cross-corpus EEG-based emotion recognition. In Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing, Aalborg, Denmark, 17–20 September 2018; pp. 1–6. [Google Scholar]
  72. Abadi, M.K.; Subramanian, R.; Kia, S.M.; Avesani, P.; Patras, I.; Sebe, N. DECAF: MEG-Based Multimodal Database for Decoding Affective Physiological Responses. IEEE Trans. Affect. Comput. 2015, 6, 209–222. [Google Scholar] [CrossRef]
  73. Bradley, A.E. The Use of the Area Under the ROC Curve in the Evaluation of Machine Learning Algorithms. Pattern Recognit. 1997, 30, 1145–1159. [Google Scholar] [CrossRef] [Green Version]
Figure 1. The proposed affect elicitation and affect detection methodology.
Figure 2. Affective expression examples for Pepper for (a) positive valence and high arousal; and (b) negative valence and low arousal.
Figure 3. Muse sensor four electrode locations on the International 10–20 system.
Figure 4. Examples of electroencephalography (EEG) signal in the time domain for the (a) positive valence high arousal (PH) session and (b) negative valence low arousal (NL) session, and the corresponding power spectral density (PSD) in the frequency domain for five consecutive sliding windows for the (c) PH session and (d) NL session for four electrode locations.
Figure 5. Average computed (a) PSD features from the θ, α, β, and γ frequency bands, and (b) frontal EEG asymmetry features based on the frontal α and β band powers for the PH and NL sessions for valence and arousal obtained from the EEG signals in Figure 4, respectively.
Figure 6. Affect elicitation human–robot interaction (HRI) scenario.
Figure 7. Box plots for reported affect for both PH and NL sessions: (a) valence and (b) arousal for all participants; and (c) valence and (d) arousal for each age group, where each box contains the interquartile range (25th to 75th percentile) of the corresponding data, yellow lines represent the median and circles represent outliers.
Figure 8. Receiver operating characteristics (ROC) curve for the neural network (NN) and support vector machine (SVM) models for (a) valence and (b) arousal.
Table 1. List of songs used for affect elicitation.

Affect Type | Song Title | Artist
Positive Valence, High Arousal | Tennessee Hayride | Jason Shaw
Positive Valence, High Arousal | Runtime Error | Peter Sharp
Positive Valence, High Arousal | Night Drive | Decktonic
Positive Valence, High Arousal | Songe D’Automne | Latché Swing
Positive Valence, High Arousal | Requiem for a Fish | The Freak Fandango Orchestra
Negative Valence, Low Arousal | Eight | Marcel Pequel
Negative Valence, Low Arousal | One | Marcel Pequel
Negative Valence, Low Arousal | Seven | Marcel Pequel
Negative Valence, Low Arousal | Moonlight and Roses | Lee Rosevere
Negative Valence, Low Arousal | LA | Julian Winter
Table 2. Average reported valence and arousal.

Age Group | PH Session Valence | PH Session Arousal | NL Session Valence | NL Session Arousal
All | 1.37 ± 0.68 | 0.63 ± 1.12 | −0.74 ± 1.24 | −0.58 ± 0.96
OA | 2.00 ± 0.00 | 1.17 ± 0.98 | −0.33 ± 1.96 | 0.00 ± 0.00
YA | 1.08 ± 0.64 | 0.38 ± 1.12 | −0.92 ± 0.75 | −0.85 ± 1.07
Table 3. Classification rates for the affect detection models.

Method | Data Set | Valence (NN) | Valence (SVM) | Arousal (NN) | Arousal (SVM)
Ten-fold Cross-Validation | Training set | 71.9% | 70.1% | 70.6% | 69.5%
LOO Cross-Validation | Training set | 63.7% | 61.8% | 63.3% | 61.6%
Subject-Independent Testing | Testing set | 63.3% | 62.4% | 62.6% | 61.2%
