Multimodal Technologies and Interaction

Multimodal Emotion Recognition

Special Issue Editors

Dr. Stylianos (Stelios) Asteriadis

Guest Editor

Department of Data Science and Knowledge Engineering, Maastricht University, Maastricht, The Netherlands
Interests: affective computing; human computer interaction; human activity recognition; emotion recognition; computer vision

Dr. Enrique Hortal

Guest Editor

Department of Data Science and Knowledge Engineering, Maastricht University, 6229 Maastricht, The Netherlands
Interests: machine learning; deep learning; (bio)signal processing and analysis; medical imaging; electroencephalography

Special Issue Information

Dear Colleagues,

The goal of automatic emotion recognition is to detect and recognize affect from low-level sensory cues. The processing of behavioral and emotional signals usually involves analysis of the face, body posture, and speech/vocalization, as well as biomeasurements and brain signals. In most cases, recognition relies on an explicit emotion model. These models can be coarsely divided into two categories: discrete and continuous. One of the most widely recognized and used discrete emotion models is based on the six basic emotions (sadness, happiness, fear, anger, surprise, and disgust), as proposed by Paul Ekman. More recently, the concept of compound emotions (e.g., happily surprised or happily disgusted) has also been explored in AI. Many studies also use dimensional spaces (e.g., valence, arousal), mainly because discrete models, although easily applicable to classification problems, do not always fully describe every emotion-enriched experience, and their label-based character limits their applicability in domains where the intensity of the emotion is important.

For more accurate emotion recognition, many works have proposed multimodal fusion approaches that combine various cues (e.g., visual, audio, wearable-device data, brain signals). Efforts to develop such methods, however, often face challenges due to a significant lack of suitable datasets for training and testing. As a result, one of the most typical problems faced by researchers in affective computing is that a system trained on one dataset achieves accurate results on that dataset but fails to reach high accuracy when the same model is applied to a different dataset. This becomes even more challenging when data are captured in non-controlled, spontaneous conditions. Regarding AI methods for emotion recognition, various techniques have been proposed, with the most recent studies focusing on end-to-end, deep-learning topologies as a way to take advantage of as much training data as possible, aiming at high accuracy on datasets captured in the wild.
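As a rough illustration of the multimodal fusion idea mentioned above, the following minimal Python sketch shows one common variant, decision-level (late) fusion, in which each modality produces its own class probabilities and a weighted average yields the final label. It is a sketch under stated assumptions, not a method prescribed by this Special Issue: the function names `late_fusion` and `predict`, the modality names, the weights, and the probability values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# The six basic emotions named in the text (discrete emotion model).
EMOTIONS = ["sadness", "happiness", "fear", "anger", "surprise", "disgust"]


def late_fusion(modality_probs: Dict[str, List[float]],
                weights: Dict[str, float]) -> List[float]:
    """Return the weighted average of per-modality class probabilities."""
    total = sum(weights[name] for name in modality_probs)
    if total <= 0:
        raise ValueError("modality weights must sum to a positive value")
    fused = [0.0] * len(EMOTIONS)
    for name, probs in modality_probs.items():
        if len(probs) != len(EMOTIONS):
            raise ValueError(f"{name}: expected {len(EMOTIONS)} probabilities")
        for i, p in enumerate(probs):
            # Normalizing by the weight total keeps the result a distribution.
            fused[i] += weights[name] * p / total
    return fused


def predict(modality_probs: Dict[str, List[float]],
            weights: Dict[str, float]) -> str:
    """Return the emotion label with the highest fused probability."""
    fused = late_fusion(modality_probs, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return EMOTIONS[best]


# Hypothetical classifier outputs, made up for this example only.
example_probs = {
    "face":   [0.10, 0.60, 0.05, 0.05, 0.15, 0.05],
    "speech": [0.05, 0.30, 0.05, 0.10, 0.45, 0.05],
}

print(predict(example_probs, {"face": 0.5, "speech": 0.5}))  # happiness
print(predict(example_probs, {"face": 0.2, "speech": 0.8}))  # surprise
```

With equal weights the facial cue dominates, while trusting speech more flips the decision. In practice the weights would be tuned on held-out data, and they may need retuning when moving to a different dataset.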

This Special Issue is looking for high-quality research contributions in one or more of the following domains:

New computational models in multimodal emotion recognition, using deep-learning topologies
Personalized emotional models and multimodality
Domain adaptation across datasets in emotion recognition
Domain adaptation across modalities in emotion recognition
New multimodal datasets for emotion recognition

Dr. Stylianos (Stelios) Asteriadis
Dr. Enrique Hortal
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Multimodal Technologies and Interaction is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.

Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.

Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.

External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.

Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (1 paper)

Research

21 pages, 7660 KB

Open Access Article

A Multimodal Facial Emotion Recognition Framework through the Fusion of Speech with Visible and Infrared Images

by Mohammad Faridul Haque Siddiqui and Ahmad Y. Javaid

Multimodal Technol. Interact. 2020, 4(3), 46; https://doi.org/10.3390/mti4030046 - 6 Aug 2020

Cited by 53 | Viewed by 11965

Abstract

The exigency of emotion recognition is pushing the envelope for meticulous strategies of discerning actual emotions through the use of superior multimodal techniques. This work presents a multimodal automatic emotion recognition (AER) framework capable of differentiating between expressed emotions with high accuracy. The contribution involves implementing an ensemble-based approach for the AER through the fusion of visible images and infrared (IR) images with speech. The framework is implemented in two layers, where the first layer detects emotions using single modalities while the second layer combines the modalities and classifies emotions. Convolutional Neural Networks (CNN) have been used for feature extraction and classification. A hybrid fusion approach comprising early (feature-level) and late (decision-level) fusion, was applied to combine the features and the decisions at different stages. The output of the CNN trained with voice samples of the RAVDESS database was combined with the image classifier’s output using decision-level fusion to obtain the final decision. An accuracy of 86.36% and similar recall (0.86), precision (0.88), and f-measure (0.87) scores were obtained. A comparison with contemporary work endorsed the competitiveness of the framework with the rationale for exclusivity in attaining this accuracy in wild backgrounds and light-invariant conditions.

(This article belongs to the Special Issue Multimodal Emotion Recognition)
