Peer-Review Record

Building an Emotionally Responsive Avatar with Dynamic Facial Expressions in Human–Computer Interactions

Multimodal Technol. Interact. 2021, 5(3), 13; https://doi.org/10.3390/mti5030013
by Heting Wang 1,†, Vidya Gaddy 2,*,†, James Ross Beveridge 2,† and Francisco R. Ortega 2,*,†
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 29 December 2020 / Revised: 1 March 2021 / Accepted: 16 March 2021 / Published: 20 March 2021
(This article belongs to the Special Issue Social Interaction and Psychology in XR)

Round 1

Reviewer 1 Report

This manuscript presents an interesting idea: testing user experiences with an avatar that shows empathy via dynamic facial expressions. The authors trained the avatar on socially relevant cues extracted from human-human interaction. The development of the backend of the avatar’s affective system is quite appealing.

However, in my opinion, the empirical studies in their current form are not rigorous enough to warrant publication. Most importantly, it is not reported how the users perceived the avatar’s emotions. The data for “positive response” is high for the Demo state despite its being the one that got the least positive response for being helpful. Also, the users reported more negative responses for “It helped to look at the avatar’s face”. How did the observers perceive empathy in a non-verbal situation based on simulated facial expressions if they did not feel like looking at the avatar’s face? The authors claim that “empathic Diana” got the most positive responses, but it is not clear what is meant by “empathetic”. Additional blocks in the experiment, in which users look at Diana’s facial expressions and classify them, are necessary to understand the users’ point of view. As the common saying goes, “beauty is in the eye of the beholder”, so an “empirical human study for Diana’s ability to simulate empathy” cannot be complete unless we know what the humans were perceiving for the 3 modes. An additional appraisal block by users is especially important because the authors chose the more complex emotions of concentration, confusion, and sympathy, which varied in percent-rating for the participant pool of experiment-1. As the authors have noted in lines 309-325, there are cultural differences in how humans perceive what were once thought to be globally recognized emotions; therefore, it is crucial to investigate every participant’s perception of the avatar’s facial expressions for a better understanding of an empathetic avatar. Here is a citation that may be helpful:

“In fact, an avatar who did not smile was perceived as significantly more empathic than one who did.” Guadagno, R. E., Swinth, K. R., & Blascovich, J. (2011). Social evaluations of embodied agents and avatars. Computers in Human Behavior, 27(6), 2380-2385.

In short, my major concern is that no user perceptual test was performed to appraise the avatar’s empathy, despite the well-researched and well-designed backend.

Minor comments:

Pg.1, Line-25: I do not understand how one can use “socially NON-behavioral cues”, especially for extracting affective metrics. In my opinion, overt behavior is the only aspect that can be objectively observed and measured.

Pg.3, Line-33: The term “further improve” is not clear. Who or what was improving based on human facial expressions?

Pg.6, Line-178: I would recommend “human behavioral research” instead of “behavioral human research”.

Pg.12, Figures-7-10: The text is cut off. Requires reformatting.

Pg.16, Line-502: The questionnaire item about “the amount of facial hair” of participants may sound less funny/odd if a reason for it is given.

Pg.17, Table-2: It seems odd that the positive responses for Demo are reported only for one question, question-3. Why is the data being withheld?

The following lines make the paper seem weak to me:

Pg.18, Line-567-8: "It helped to look at the avatar’s face" received much more negative responses than positive responses in all three modes.

Pg.18, Line-573: "The avatar was helping me", mode Demo received the least positive responses.

Pg.18, Line-576: However, mode Demo beat the other two modes by having gained more positive responses in the rest of the questions.

Pg.20, Line-619: In the results section, the authors claim a difference in affect relationship between “signalers” and “builders”. Their interaction also forms the backbone for training the avatar Diana. However, no details about “signalers” and “builders” are provided in the paper: how many dyads were observed, what affective metrics were extracted, and whether there were statistical differences between signalers’ and builders’ responses.

Pg.20, Line-625: It is not clear if the participants perceive these emotions “intense joy, frustration and confusion” in human-human interaction or human-avatar interaction.

Pg.20, Line-644: Incomplete sentence? “Due to we asked about…”?

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

The authors have developed an avatar that expresses a higher level of affective state empathy. I highly recommend publishing the manuscript.

Because the natural decision-making process in humans cannot be adequately modeled by classical probability and requires more accessible information, I suggest that the authors consider the following papers in cognitive science and AI:

  1. Fell, Lauren, et al. "An experimental protocol to derive and validate a quantum model of decision-making." arXiv preprint arXiv:1908.07935 (2019).
  2. Dehdashti, Shahram, Lauren Fell, and Peter Bruza. "On the irrationality of being in two minds." Entropy 22.2 (2020): 174.
  3. Uprety, Sagar, et al. "Quantum-like structure in multidimensional relevance judgements." European Conference on Information Retrieval. Springer, Cham, 2020.
  4. Gkoumas, Dimitris, et al. "Quantum Cognitively Motivated Decision Fusion for Video Sentiment Analysis." arXiv preprint arXiv:2101.04406 (2021).

Also, please fix Figures 7 and 8.

Author Response

We were unable to replicate the issue with Figures 7-10 on our systems. In a previous edit, I believe we resolved this overlapping issue. Please let us know if these charts still overlap in the edited version of this document.

We have reviewed the sources you have graciously provided. They are highly reputable and would be great additions to future work being done in the field of affective computing.

Thank you very much for your positive review of our manuscript.

Reviewer 3 Report

This paper studied the effect of a natural 3D interface with a human-centered experience. The proposed study simulated empathic facial expressions that were added to the 3D interface; human responses were recorded. Twenty-one users participated in this study, and a statistical analysis was performed.

This paper provides a useful resource for a natural 3D interface design. The procedures of the experiments were well documented. Limitations were discussed, and suggestions were also provided. However, twenty-one responses is a limited sample size, which may affect the reliability of the p-value.

What is the null hypothesis of the proposed study?

How does the accuracy of Affdex affect the outcome of the proposed study?

Abstract: There are grammatical errors present in the abstract. Additionally, the sentence “We found that the person who gave instructions was happier than the builder” seems out of place in the flow of the abstract; connecting it to the preceding sentence would make the abstract easier to follow.

Lines 16-18: There are grammatical errors, particularly in the use of commas, which results in a run-on first sentence.

Line 43: “most frequently occurred” -> most frequently occurring

Line 44: “set Diana” -> set that Diana

Line 542: “a optional” -> an optional

Line 644: “Due to we asked about … ,” not a sentence

Line 439: “The pseudo-code of … is shown below.” The algorithm should be indexed and referenced from the text.

There are other grammatical errors in the paper. Fixing them will increase the readability of the paper.

The captions of Figure 7 and Figure 8 were cut off.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 4 Report

In the attached file.

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Thank you very much for your review.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Thank you for your rebuttal. The manuscript is much improved. However, the appraisals of Diana’s complex facial responses come from another group, not from the users filling out the questionnaire.

Since the authors are imparting "empathy" to Diana based on facial expressions, I am not convinced by the Response-2... I fail to see how claims about perception of empathy can be made when no appraisals of facial expressions of empathy were measured from the user group.

Author Response

We reviewed our manuscript with your suggestions in mind. We adjusted our phrasing at numerous locations in the article to be more specific when discussing the capabilities of our avatar Diana. We excluded any reference to the umbrella term "empathy" when it was used to describe Diana (one instance, in which "empathic" was used as a label, was left because it was clearly defined as part of a table). Instead, we used terms that made the specialized state of Diana clearer to readers, such as "emotionally intelligent" or "emotionally responsive". We made sure to leave our definition of empathy in the article because several sources we cite used the term "empathy" in their works and we wanted to provide readers with a specific definition. We also changed the name of the manuscript from "Building an Empathic Avatar with Dynamic Facial Expressions in Human-computer Interaction" to "Building an Emotionally Responsive Avatar with Dynamic Facial Expressions in Human-computer Interaction". If the title change is unnecessary or unwanted, please let me know. Thank you for your feedback.
